MEMBRANE-ASSOCIATED AND SECRETED PROTEINS
AND USES THEREOF 



This application claims priority to co-pending U.S. Application No. 09/345,464,
filed June 30, 1999, the entire contents of which are incorporated herein by reference.

Background of the Invention 

Many secreted proteins, for example, cytokines, play a vital role in the regulation of 
cell growth, cell differentiation, and a variety of specific cellular responses. A number of
medically useful proteins, including erythropoietin, granulocyte-macrophage colony
stimulating factor, human growth hormone, and various interleukins, are secreted proteins.

Many membrane-associated proteins are receptors which bind a ligand and 
transduce an intracellular signal, leading to a variety of cellular responses. The 
identification and characterization of such a receptor enables one to identify both the
ligands which bind to the receptor and the intracellular molecules and signal transduction
pathways associated with the receptor, permitting one to identify or design modulators of 
receptor activity, e.g., receptor agonists or antagonists and modulators of signal 
transduction. 

Thus, an important goal in the design and development of new therapies is the 
identification and characterization of membrane-associated and secreted proteins and the
genes which encode them. 

Summary of the Invention 

The present invention is based, at least in part, on the discovery of cDNA molecules 
encoding INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295,
TANGO 354, and TANGO 378, all of which are either wholly secreted or transmembrane
proteins. These proteins, fragments, derivatives, and variants thereof are collectively
referred to as "polypeptides of the invention" or "proteins of the invention." Nucleic acid
molecules encoding the polypeptides or proteins of the invention are collectively referred to
as "nucleic acids of the invention."

The nucleic acids and polypeptides of the present invention are useful as modulating 
agents in regulating a variety of cellular processes. Accordingly, in one aspect, this 
invention provides isolated nucleic acid molecules encoding a polypeptide of the invention
or a biologically active portion thereof. The present invention also provides nucleic acid
molecules which are suitable for use as primers or hybridization probes for the detection of
nucleic acids encoding a polypeptide of the invention.

The invention features nucleic acid molecules which are at least 45% (or 55%, 65%,
75%, 85%, 95%, or 98%) identical to the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7,
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the nucleotide sequence of the
cDNA insert of a clone deposited with ATCC® as Accession Number 207178 (the "cDNA
of ATCC® Accession Number 207178"), the nucleotide sequence of the cDNA insert of a
clone deposited with ATCC® as Accession Number PTA-249 (the "cDNA of ATCC®
Accession Number PTA-249"), or the nucleotide sequence of the cDNA insert of a clone
deposited with ATCC® as Accession Number PTA-250 (the "cDNA of ATCC® Accession
Number PTA-250"), or a complement thereof.
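As an illustration only, the phrase "a complement thereof" can be pictured with the following minimal Python sketch, which computes the reverse complement of a DNA string under standard Watson-Crick pairing. The function name `reverse_complement` and the handling of the ambiguity code N are assumptions made for this example; the sketch is not part of the disclosed methods.

```python
# Illustrative sketch: Watson-Crick reverse complement of a DNA string.
# The name reverse_complement and the treatment of "N" are assumptions
# made for this example, not terms defined in the specification.

_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Reverse the input so the result is read in the conventional direction,
    # then swap each base for its pairing partner.
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Any character other than A, C, G, T or N raises a `KeyError`, which is a deliberate simplification for the sketch.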

The invention features nucleic acid molecules which include a fragment of at least 
300 (325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1200, 1400, 
1600, 1800, 2000, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, or 4000) nucleotides of
the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22,
24, 25, 27, 28 or 30, the nucleotide sequence of the cDNA of ATCC® Accession Number
207178, the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-249, or
the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-250, or a
complement thereof.

The invention also features nucleic acid molecules which include a nucleotide 
sequence encoding a protein having an amino acid sequence that is at least 45% (or 55%, 
65%, 75%, 85%, 95%, or 98%) identical to the amino acid sequence of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC® 
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC® 
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC® 
Accession Number PTA-250. 
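The percent-identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over non-gap columns only; both the helper name `percent_identity` and that choice of denominator are assumptions for illustration, and an alignment program may use a different convention.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# Assumptions (not taken from the specification): gaps are written as "-",
# and columns containing a gap are excluded from the denominator.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared (non-gap) columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 5 identical residues in 6 compared columns.
    score = percent_identity("MKT-LLV", "MKSALLV")
    print(f"{score:.1f}% identical; at least 45%: {score >= 45.0}")
```

A threshold such as "at least 45%" then reduces to a simple comparison against the returned value.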

In preferred embodiments, the nucleic acid molecules have the nucleotide sequence 
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the
nucleotide sequence of the cDNA of ATCC® Accession Number 207178, the nucleotide
sequence of the cDNA of ATCC® Accession Number PTA-249, or the nucleotide sequence
of the cDNA of ATCC® Accession Number PTA-250, or a complement thereof.

Also within the invention are nucleic acid molecules which encode a fragment of a 
polypeptide having the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26,
or 29, or a fragment including at least 15 (25, 30, 50, 100, 150, 300, 400, 500, 600, 700,
800, 900, 1000, 1100, 1200, 1300, or 1400) contiguous amino acids of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250.

The invention includes nucleic acid molecules which encode a naturally occurring
allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250, wherein the nucleic acid molecule hybridizes to a nucleic acid
molecule consisting of a nucleic acid sequence encoding SEQ ID NOs:2, 5, 8, 11, 14, 17,
20, 23, 26, or 29, the nucleotide sequence of the cDNA of ATCC® Accession Number
207178, the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-249, or
the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-250, or a
complement thereof under stringent conditions.

Also within the invention are isolated polypeptides or proteins having an amino acid 
sequence that is at least about 60%, preferably 65%, 75%, 85%, 95%, or 98% identical to 
the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, the amino
acid sequence encoded by the cDNA of ATCC® Accession Number 207178, the amino acid 
sequence encoded by the cDNA of ATCC® Accession Number PTA-249, or the amino acid 
sequence encoded by the cDNA of ATCC® Accession Number PTA-250. 

Also within the invention are isolated polypeptides or proteins which are encoded by 
a nucleic acid molecule having a nucleotide sequence that is at least about 60%, preferably 
65%, 75%, 85%, or 95% identical to the nucleic acid sequence encoding SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, and isolated polypeptides or proteins which are encoded by a
nucleic acid molecule having a nucleotide sequence which hybridizes under stringent
hybridization conditions to a nucleic acid molecule having the nucleotide sequence of SEQ
ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a
complement thereof, the non-coding strand of the cDNA of ATCC® Accession Number
207178, the non-coding strand of the cDNA of ATCC® Accession Number PTA-249, or the
non-coding strand of the cDNA of ATCC® Accession Number PTA-250.

Also within the invention are polypeptides which are naturally occurring allelic 
variants of a polypeptide that includes the amino acid sequence of SEQ ID NOs:2, 5, 8, 11,
14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250, wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule having the sequence of SEQ ID
NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a complement
thereof, under stringent conditions. Such allelic variants differ at 1%, 2%, 3%, 4%, or 5% of
the amino acid residues.

The invention also features nucleic acid molecules that hybridize under stringent
conditions to a nucleic acid molecule having the nucleotide sequence of SEQ ID NOs:1, 3,
4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the cDNA of ATCC®
Accession Number 207178, the cDNA of ATCC® Accession Number PTA-249, or the
cDNA of ATCC® Accession Number PTA-250, or a complement thereof. In other
embodiments, the nucleic acid molecules are at least 300 (325, 350, 375, 400, 425, 450,
500, 550, 600, 650, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600,
2800, 3000, 3200, 3400, 3600, 3800, 4000, or 4200) nucleotides in length and hybridize
under stringent conditions to a nucleic acid molecule consisting of the nucleotide sequence
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the
cDNA of ATCC® Accession Number 207178, the cDNA of ATCC® Accession Number
PTA-249, or the cDNA of ATCC® Accession Number PTA-250, or a complement thereof.

In other embodiments, the isolated nucleic acid molecules encode an extracellular, 
transmembrane, or cytoplasmic domain of a polypeptide of the invention.

In another embodiment, the invention provides an isolated nucleic acid molecule 
which is antisense to the coding strand of a nucleic acid of the invention. 

Another aspect of the invention provides vectors, e.g., recombinant expression
vectors, comprising a nucleic acid molecule of the invention. In another embodiment, the
invention provides host cells containing such a vector or a nucleic acid molecule of the
invention. The invention also provides methods for producing a polypeptide of the
invention by culturing, in a suitable medium, a host cell of the invention containing a
recombinant expression vector such that a polypeptide is produced.

Another aspect of this invention features isolated or recombinant proteins and 
polypeptides of the invention. Preferred proteins and polypeptides possess at least one 
biological activity possessed by the corresponding naturally-occurring human polypeptide. 
An activity, a biological activity, or a functional activity of a polypeptide or nucleic acid of 
the invention refers to an activity exerted by a protein, polypeptide or nucleic acid molecule 
of the invention on a responsive cell as determined in vivo, or in vitro, according to standard 
techniques. Such activities can be a direct activity, such as an association with or an 
enzymatic activity on a second protein, or an indirect activity, such as a cellular signaling
activity mediated by interaction of the protein with a second protein.

In one embodiment, the isolated polypeptide of the invention lacks both a 
transmembrane and a cytoplasmic domain. In another embodiment, the polypeptide lacks 
both a transmembrane domain and a cytoplasmic domain and is soluble under physiological 
conditions. 

For INTERCEPT 340, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the
naturally-occurring polypeptide; (2) the ability to bind a ligand of the naturally-occurring
polypeptide; (3) the ability to interact with an INTERCEPT 340 receptor, e.g., a cell surface
receptor (e.g., an integrin); (4) the ability to modulate the activity of an intracellular
molecule that participates in a signal transduction pathway, e.g., an intracellular molecule in
the integrin signalling pathway (e.g., a cdk2 inhibitor); (5) the ability to assemble into fibrils; (6) the
ability to strengthen and organize the extracellular matrix; (7) the ability to modulate the
shape of tissues and cells; (8) the ability to interact with (e.g., bind to) components of the
extracellular matrix; and (9) the ability to modulate cell migration. Other activities include
the ability to modulate function, survival, morphology, migration, proliferation and/or
differentiation of cells of tissues in which it is expressed (e.g., splenic cells). For example,
additional biological activities of INTERCEPT 340 include: (1) the ability to modulate
splenic cell activity; (2) the ability to modulate skeletal morphogenesis; and/or (3) the
ability to modulate smooth muscle cell proliferation and differentiation.

For MANGO 003, biological activities include, e.g., (1) the ability to form
protein-protein (e.g., protein-ligand) interactions with proteins in the signaling pathway of the
naturally-occurring polypeptide; (2) the ability to interact with (e.g., bind to) a ligand of the
naturally-occurring polypeptide; (3) the ability to interact with a MANGO 003 receptor,
e.g., a cell surface receptor; (4) the ability to modulate cell surface recognition; (5) the
ability to transduce an extracellular signal (e.g., by interacting with a ligand and/or a
cell-surface receptor); (6) the ability to modulate a signal transduction pathway; and (7) the
ability to modulate signal transmission at a chemical synapse. Other activities include the
ability to modulate function, survival, morphology, proliferation and/or differentiation of
cells of tissues in which it is expressed (e.g., thyroid, liver, skeletal muscle, kidney, heart,
lung, testis and brain). For example, the activities of MANGO 003 can include modulation
of endocrine, hepatic, skeletal muscular, renal, cardiovascular, reproductive and/or brain
function.

For MANGO 347, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to interact with a ligand of the naturally-occurring polypeptide;
(3) the ability to interact with a MANGO 347 receptor; and (4) the ability to modulate a
developmental process, e.g., morphogenesis, cellular migration, adhesion, proliferation,
differentiation, and/or survival. Other activities include the ability to modulate function,
survival, morphology, proliferation and/or differentiation of cells of tissues in which it is 
expressed (e.g., brain cells). For example, the activities of MANGO 347 can include 
modulation of neural (e.g., CNS) function. 

For TANGO 272, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 272 receptor, e.g., a cell surface receptor (e.g., an
integrin); (4) the ability to modulate cell-cell contact; (5) the ability to modulate cell
attachment; (6) the ability to modulate cell fate; and (7) the ability to modulate tissue repair
and/or wound healing. Other activities include the ability to modulate function, survival,
morphology, proliferation and/or differentiation of cells of tissues in which it is expressed
(e.g., microvascular endothelial cells). For example, the activities of TANGO 272 can
include modulation of cardiovascular function. 

For TANGO 295, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 295 receptor; (4) the ability to interact with (e.g., bind to)
a nucleic acid; and (5) the ability to elicit pyrimidine-specific endonuclease activity. Other
activities include the ability to modulate function, survival, morphology, proliferation
and/or differentiation of cells of tissues in which it is expressed (e.g., mammary
epithelium). 

For TANGO 354, biological activities include, e.g., (1) the ability to form protein- 
protein interactions with proteins in the signaling pathway of the naturally-occurring 
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the 
ability to interact with (e.g., bind to) a TANGO 354 receptor, e.g., a cell surface receptor;
(4) the ability to modulate cell surface recognition; (5) the ability to modulate cellular
motility, e.g., chemotaxis and/or chemokinesis; (6) the ability to transduce an extracellular
signal (e.g., by interacting with a ligand and/or a cell-surface receptor); and (7) the ability to
modulate a signal transduction pathway. Other activities include the ability to modulate
function, survival, morphology, proliferation and/or differentiation of cells of tissues in
which it is expressed (e.g., hematopoietic tissues). For example, TANGO 354 biological
activities can further include: (1) regulation of hematopoiesis; (2) modulation (e.g.,
increasing or decreasing) of haemostasis; (3) modulation of an inflammatory response; (4)
modulation of neoplastic growth, e.g., inhibition of tumor growth; and (5) modulation of
thrombolysis. 

For TANGO 378, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 378 receptor; (4) the ability to transduce an extracellular
signal; and (5) the ability to modulate a signal transduction pathway (e.g., adenylate
cyclase, or phosphatidylinositol 4,5-bisphosphate (PIP2), inositol 1,4,5-triphosphate (IP3)).
Other activities include the ability to modulate function, survival, morphology, proliferation
and/or differentiation of cells of tissues in which it is expressed (e.g., natural killer cells).
For example, TANGO 378 biological activities can further include the ability to modulate
an immune response in a subject, for example, (1) by modulating immune cytotoxic
responses against pathogenic organisms, e.g., viruses, bacteria, and parasites; (2) by
modulating organ rejection after transplantation; and (3) by modulating immune recognition
and lysis of normal and malignant cells.

In one embodiment, a polypeptide of the invention has an amino acid sequence 
sufficiently identical to an identified domain of a polypeptide of the invention. As used 
herein, the term "sufficiently identical" refers to a first amino acid or nucleotide sequence
which contains a sufficient or minimum number of identical or equivalent (e.g., with a
similar side chain) amino acid residues or nucleotides to a second amino acid or nucleotide
sequence such that the first and second amino acid or nucleotide sequences have a common
structural domain and/or common functional activity. For example, amino acid or
nucleotide sequences which contain a common structural domain having about 60%
identity, preferably 65% identity, more preferably 75%, 85%, 95%, 98% or more identity
are defined herein as sufficiently identical.

In one embodiment, a MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, or TANGO 378 polypeptide of the invention includes a signal peptide. 

In another embodiment, a nucleic acid molecule of the invention encodes a 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 
polypeptide which includes a signal peptide.

In another embodiment, a MANGO 003, TANGO 272, TANGO 354, or TANGO 
378 polypeptide of the invention includes one or more of the following domains: (1) a 
signal peptide; (2) an N-terminal extracellular domain; (3) a C-terminal transmembrane 
domain; and (4) a cytoplasmic domain. 

The polypeptides of the present invention, or biologically active portions thereof, 
can be operably linked to a heterologous amino acid sequence to form fusion proteins. In 
one embodiment, the fusion protein consists of a chimeric protein assembled from portions 
of the protein from different species.

In one embodiment, the isolated polypeptide of the invention lacks both a 
transmembrane and a cytoplasmic domain. In another embodiment, the polypeptide lacks
both a transmembrane domain and a cytoplasmic domain and is soluble under physiological 
conditions. 

The invention further features antibodies that specifically bind a polypeptide of the
invention such as monoclonal or polyclonal antibodies. In addition, the polypeptides of the
invention or biologically active portions thereof, or antibodies of the invention, can be
incorporated into pharmaceutical compositions, which optionally include pharmaceutically
acceptable carriers.

In another aspect, the present invention provides methods for detecting the presence 
of the activity or expression of a polypeptide of the invention in a biological sample by 
contacting the biological sample with an agent capable of detecting an indicator of activity 
such that the presence of activity is detected in the biological sample.

In another aspect, the invention provides methods for modulating activity of a 
polypeptide of the invention comprising contacting a cell with an agent that modulates 
(inhibits or stimulates) the activity or expression of a polypeptide of the invention such that 
activity or expression in the cell is modulated. In one embodiment, the agent is an antibody 
that specifically binds to a polypeptide of the invention. 

In another embodiment, the agent modulates expression of a polypeptide of the 
invention by modulating transcription, splicing, or translation of an mRNA encoding a 
polypeptide of the invention. In yet another embodiment, the agent is a nucleic acid 
molecule having a nucleotide sequence that is antisense to the coding strand of an mRNA 
encoding a polypeptide of the invention. 

The present invention also provides methods to treat a subject having a disorder 
characterized by aberrant activity of a polypeptide of the invention or aberrant expression of 
a nucleic acid of the invention by administering an agent which is a modulator of the 
activity of a polypeptide of the invention or a modulator of the expression of a nucleic acid 
of the invention to the subject. In one embodiment, the modulator is a protein of the 
invention. In another embodiment, the modulator is a nucleic acid of the invention. In 
other embodiments, the modulator is a peptide, peptidomimetic, or other small organic 
molecule. The present invention also provides diagnostic assays for identifying the presence
or absence of a genetic lesion or mutation characterized by at least one of: (i) aberrant
modification or mutation of a gene encoding a polypeptide of the invention, (ii)
mis-regulation of a gene encoding a polypeptide of the invention, and (iii) aberrant
post-translational modification of a polypeptide of the invention, wherein a wild-type form
of the gene encodes a protein having the activity of the polypeptide of the invention.

In another aspect, the invention provides a method for identifying a compound that 
binds to or modulates the activity of a polypeptide of the invention. In general, such 
methods entail measuring a biological activity of the polypeptide in the presence and 
absence of a test compound and identifying those compounds which alter the activity of the 
polypeptide. 

The invention also features methods for identifying a compound which modulates 
the expression of a polypeptide or nucleic acid of the invention by measuring the expression 
of the polypeptide or nucleic acid in the presence and absence of the compound.

In yet a further aspect, the invention provides substantially purified antibodies or
fragments thereof including human and non-human antibodies or fragments thereof which
antibodies or fragments specifically bind to a polypeptide comprising an amino acid
sequence selected from the group consisting of: the amino acid sequence of SEQ ID NOs:2,
5, 8, 11, 14, 17, 20, 23, 26, or 29 or the amino acid sequence encoded by the cDNA insert of
the plasmid deposited with the ATCC® as Accession Number 207178, the amino acid
sequence encoded by the cDNA insert of the plasmid deposited with the ATCC® as
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA insert of
the plasmid deposited with the ATCC® as Accession Number PTA-250; a fragment of at
least 15 amino acid residues of the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17,
20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid
sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the percent identity
is determined using the ALIGN program of the GCG software package with a PAM120
weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid
sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid
molecule consisting of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24,
25, 27, 28 or 30, under conditions of hybridization of 6X SSC at 45°C and washing in 0.2X
SSC, 0.1% SDS at 65°C. In various embodiments, the substantially purified antibodies of
the invention, or fragments thereof, can be human, non-human, chimeric and/or humanized
antibodies.

Any of the antibodies of the invention can be conjugated to a therapeutic moiety or
to a detectable substance. Non-limiting examples of detectable substances that can be 
conjugated to the antibodies of the invention are an enzyme, a prosthetic group, a 
fluorescent material, a luminescent material, a bioluminescent material, and a radioactive 
material. 

The invention also provides a kit containing an antibody of the invention conjugated 
to a detectable substance, and instructions for use. Still another aspect of the invention is a 
pharmaceutical composition comprising an antibody of the invention and a 
pharmaceutically acceptable carrier. In preferred embodiments, the pharmaceutical 
composition contains an antibody of the invention, a therapeutic moiety, and a
pharmaceutically acceptable carrier.

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

Brief Description of the Drawings 

Figures 1A-1B depict the cDNA sequence of human INTERCEPT 340 (SEQ ID
NO:1) and the predicted amino acid sequence of INTERCEPT 340 (SEQ ID NO:2). The
open reading frame of SEQ ID NO:1 extends from nucleotide 1222 to nucleotide 1944 of
SEQ ID NO:1 (SEQ ID NO:3).
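For orientation, the open reading frame coordinates quoted in these figure descriptions can be read as 1-based, inclusive positions. The following Python sketch, offered only as an illustration under that assumption, computes the length of such an interval and slices the corresponding region out of a cDNA string. The helper names `orf_length` and `extract_orf` are hypothetical, and whether a given interval includes the stop codon is not stated here.

```python
# Illustrative sketch: handling 1-based, inclusive ORF coordinates.
# The helper names are hypothetical; the coordinates in ORFS are the ones
# quoted in the figure descriptions of this document.


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of the interval start..end (1-based, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_orf(cdna: str, start: int, end: int) -> str:
    """Slice the ORF out of a cDNA string using 1-based, inclusive positions."""
    if end > len(cdna):
        raise ValueError("end lies beyond the cDNA sequence")
    orf_length(start, end)  # reuse the coordinate validation
    return cdna[start - 1:end]


ORFS = {
    "human INTERCEPT 340": (1222, 1944),
    "human MANGO 003": (57, 1568),
    "human TANGO 272": (230, 3379),
}

if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        n = orf_length(start, end)
        # A remainder of 0 means the interval is a whole number of codons.
        print(f"{name}: {n} nt, {n // 3} codons, remainder {n % 3}")
```

Checking that the interval length is a multiple of three is a quick sanity test that the quoted coordinates describe a complete reading frame.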

Figure 2 depicts a hydropathy plot of human INTERCEPT 340. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential
N-glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of
INTERCEPT 340 are indicated. The amino acid sequence of each of the fibrillar collagen
C-terminal domains is indicated by underlining and the abbreviation "COLF".
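A hydropathy trace of the kind described can be sketched as a sliding-window average of per-residue hydropathy values. The Python example below uses the Kyte-Doolittle scale and a window of 9 residues; both are assumptions chosen for illustration, since the scale and window used to generate the figures are not stated here, and the function name `hydropathy_profile` is hypothetical.

```python
# Illustrative sketch: sliding-window hydropathy profile.
# Assumptions (not taken from the specification): Kyte-Doolittle scale,
# default window of 9 residues, positive values = relatively hydrophobic.
from typing import List

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> List[float]:
    """Mean hydropathy of each full window along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy peptide, not a sequence from this document.
    for position, value in enumerate(hydropathy_profile("MKTLLVAGLLIVSAW", 9), 1):
        print(position, round(value, 2))
```

Windows with a positive mean would plot above a zero line and windows with a negative mean below it, which mirrors the dashed horizontal line described for the figure.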

Figure 3 depicts an alignment of each of the fibrillar collagen C-terminal domains
(also referred to herein as "COLF domains") of human INTERCEPT 340 with consensus
hidden Markov model COLF domains. For each alignment, the upper sequence is the
consensus amino acid sequence (SEQ ID NOs:31, 32, and 33), while the lower
amino acid sequence corresponds to amino acid 58 to amino acid 116 of SEQ ID NO:2
(SEQ ID NO:34), amino acid 126 to amino acid 151 of SEQ ID NO:2 (SEQ ID NO:35), and
amino acid 186 to amino acid 217 of SEQ ID NO:2 (SEQ ID NO:36).

Figures 4A-4C depict the cDNA sequence of human MANGO 003 (SEQ ID NO:4) 
and the predicted amino acid sequence of MANGO 003 (SEQ ID NO:5). The open reading 
frame of SEQ ID NO:4 extends from nucleotide 57 to nucleotide 1568 of SEQ ID NO:4
(SEQ ID NO:6).

Figure 5 depicts a hydropathy plot of human MANGO 003. Relatively hydrophobic
residues are above the dashed horizontal line, and relatively hydrophilic residues are below
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the
hydropathy plot, the numbers corresponding to the amino acid sequence of MANGO 003
are indicated. The amino acid sequences of each of the immunoglobulin domains and of the
neurotransmitter gated ion channel domain are indicated by underlining and the
abbreviations "ig" and "neur chan", respectively.

Figure 6 depicts an alignment of each of the immunoglobulin domains (also referred
to herein as "Ig domains") of human MANGO 003 with the consensus hidden Markov
model immunoglobulin domains. For each alignment, the upper sequence is the consensus
sequence (SEQ ID NO:37), while the lower sequence corresponds to amino acid 44 to
amino acid 101 of SEQ ID NO:5 (SEQ ID NO:38), amino acid 165 to amino acid 223 of
SEQ ID NO:5 (SEQ ID NO:39), and amino acid 261 to amino acid 340 of SEQ ID NO:5
(SEQ ID NO:40).

Figure 7 depicts an alignment of the neurotransmitter gated ion channel domain of
human MANGO 003 with the consensus hidden Markov model neurotransmitter gated ion
channel domain. The upper sequence is the consensus sequence (SEQ ID NO:42), while the
lower sequence corresponds to amino acid 388 to amino acid 397 of SEQ ID NO:5 (SEQ ID
NO:43).

Figure 8 depicts the cDNA sequence of mouse MANGO 003 (SEQ ID NO:7) and
the predicted amino acid sequence of MANGO 003 (SEQ ID NO:8). The open reading
frame of SEQ ID NO:7 extends from nucleotide 1 to nucleotide 626 of SEQ ID NO:7 (SEQ
ID NO:9).

Figure 9 depicts a hydropathy plot of mouse MANGO 003. Relatively hydrophobic
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of mouse MANGO 
003 are indicated. 

Figure 10 depicts the cDNA sequence of human MANGO 347 (SEQ ID NO:10) and
the predicted amino acid sequence of MANGO 347 (SEQ ID NO:11). The open reading
frame of SEQ ID NO:10 extends from nucleotide 31 to nucleotide 444 of SEQ ID NO:10
(SEQ ID NO:12).

Figure 11 depicts a hydropathy plot of human MANGO 347. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by 
short vertical lines just below the hydropathy trace. Below the hydropathy plot, the 
numbers corresponding to the amino acid sequence of MANGO 347 are indicated. The 
amino acid sequence of the CUB domain is indicated by underlining and the abbreviation
"CUB". 

Figure 12 depicts an alignment of the CUB domain of human MANGO 347 with a 
consensus hidden Markov model CUB domain. The upper sequence is the consensus amino 
acid sequence (SEQ ID NO:44), while the lower sequence corresponds to amino acid 40 to 
amino acid 136 of SEQ ID NO:11 (SEQ ID NO:45).

Figures 13A-13D depict the cDNA sequence of human TANGO 272 (SEQ ID
NO:13) and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:14). The open
reading frame of SEQ ID NO:13 extends from nucleotide 230 to nucleotide 3379 of SEQ ID
NO:13 (SEQ ID NO:15).

Figure 14 depicts a hydropathy plot of human TANGO 272. Relatively
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N- 
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of
TANGO 272 are indicated. The amino acid sequence of each of the fourteen EGF-like
domains and the delta serrate ligand domain is indicated by underlining and the
abbreviations "EGF-like" and "DSL", respectively.

Figures 15A-15C depict an alignment of each of the EGF-like domains of human
TANGO 272 with consensus hidden Markov model EGF-like domains. The upper
sequence is the consensus amino acid sequence (SEQ ID NO:46), while the lower sequence
corresponds to amino acid 151 to amino acid 181 of SEQ ID NO:14 (SEQ ID NO:49);
amino acid 200 to amino acid 229 of SEQ ID NO:14 (SEQ ID NO:50); amino acid 242 to
amino acid 272 of SEQ ID NO:14 (SEQ ID NO:51); amino acid 285 to amino acid 315 of
SEQ ID NO:14 (SEQ ID NO:52); amino acid 328 to amino acid 358 of SEQ ID NO:14
(SEQ ID NO:53); amino acid 378 to amino acid 404 of SEQ ID NO:14 (SEQ ID NO:54);
amino acid 417 to amino acid 447 of SEQ ID NO:14 (SEQ ID NO:55); amino acid 460 to
amino acid 490 of SEQ ID NO:14 (SEQ ID NO:56); amino acid 503 to amino acid 533 of
SEQ ID NO:14 (SEQ ID NO:57); amino acid 546 to amino acid 576 of SEQ ID NO:14
(SEQ ID NO:58); amino acid 589 to amino acid 619 of SEQ ID NO:14 (SEQ ID NO:59);
amino acid 632 to amino acid 661 of SEQ ID NO:14 (SEQ ID NO:60); amino acid 674 to
amino acid 704 of SEQ ID NO:14 (SEQ ID NO:61); and amino acid 717 to amino acid 747 of
SEQ ID NO:14 (SEQ ID NO:62). For alignment of the delta serrate ligand domain, the
upper sequence is the consensus hidden Markov model (SEQ ID NO:47), while the lower
sequence corresponds to amino acid 518 to amino acid 576 of SEQ ID NO:14 (SEQ ID
NO:63).

Figures 16A-16B depict the cDNA sequence of mouse TANGO 272 (SEQ ID
NO:16) and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:17). The open
reading frame of SEQ ID NO:16 extends from nucleotide 1 to nucleotide 1492 of SEQ ID
NO:16 (SEQ ID NO:18).

Figure 17 depicts a hydropathy plot of mouse TANGO 272. Relatively hydrophobic
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of mouse TANGO 
272 are indicated. 

Figure 18 depicts the cDNA sequence of human TANGO 295 (SEQ ID NO:22) and 
the predicted amino acid sequence of TANGO 295 (SEQ ID NO:23). The open reading 
frame of SEQ ID NO:22 extends from nucleotide 217 to nucleotide 684 of SEQ ID NO:22
(SEQ ID NO:24).

Figure 19 depicts a hydropathy plot of human TANGO 295. Relatively
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N-
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
human TANGO 295 are indicated. The amino acid sequence of the pancreatic ribonuclease 
domain is indicated by underlining and the abbreviation "RNase A".

Figure 20 depicts an alignment of the pancreatic ribonuclease domain of human
TANGO 295 with a consensus hidden Markov model pancreatic ribonuclease domain. The
upper sequence is the consensus amino acid sequence (SEQ ID NO:96), while the lower
sequence corresponds to amino acid 32 to amino acid 156 of SEQ ID NO:23 (SEQ ID 
NO:97). 

Figures 21A-21B depict the cDNA sequence of human TANGO 354 (SEQ ID
NO:25) and the predicted amino acid sequence of TANGO 354 (SEQ ID NO:26). The open
reading frame of SEQ ID NO:25 extends from nucleotide 62 to nucleotide 976 of SEQ ID
NO:25 (SEQ ID NO:27).

Figure 22 depicts a hydropathy plot of human TANGO 354. Relatively
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N- 
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of
human TANGO 354 are indicated. The amino acid sequence of the immunoglobulin
domain is indicated by underlining and the abbreviation "ig".

Figure 23 depicts an alignment of the immunoglobulin domain of human TANGO 
354 with a consensus hidden Markov model immunoglobulin domain. The upper sequence
is the consensus amino acid sequence (SEQ ID NO:37), while the lower sequence
corresponds to amino acid 33 to amino acid 110 of SEQ ID NO:26 (SEQ ID NO:41).

Figures 24A-24C depict the cDNA sequence of human TANGO 378 (SEQ ID 
NO:28) and the predicted amino acid sequence of TANGO 378 (SEQ ID NO:29). The open 
reading frame of SEQ ID NO:28 extends from nucleotide 42 to nucleotide 1625 of SEQ ID 
NO:28(SEQIDNO:30). 

Figure 25 depicts a hydropathy plot of human TANGO 378. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N- 
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
human TANGO 378 are indicated. The amino acid sequence of the seven transmembrane 
domain is indicated by underlining and the abbreviation "7tm".

Figure 26 depicts an alignment of the seven transmembrane domain of
human TANGO 378 with a consensus hidden Markov model of this domain. The upper
sequence is the consensus amino acid sequence (SEQ ID NO:98), while the lower sequence
corresponds to amino acid 187 to amino acid 515 of SEQ ID NO:29 (SEQ ID NO:99).

Figures 27A-27C depict a global alignment between the nucleotide sequence of the
open reading frame (ORF) of human MANGO 003 (SEQ ID NO:6) and the nucleotide
sequence of the open reading frame of mouse MANGO 003 (SEQ ID NO:9). The upper
sequence is the human MANGO 003 ORF nucleotide sequence, while the lower sequence is
the mouse MANGO 003 ORF nucleotide sequence. These nucleotide sequences share
31.1% identity. The global alignment was performed using the ALIGN program version
2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global alignment score
of -1212; Myers and Miller, 1989, CABIOS 4:11-7).
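
For illustration, a percent identity figure can be derived from an already gapped pair of aligned sequences as sketched below. The denominator convention is an assumption: here identical columns are divided by the total number of alignment columns, including gap columns. The convention actually applied by the ALIGN program is not stated in this text, and other conventions give different values. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap residue.

    Both inputs must already be aligned (equal length, gaps inserted).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Made-up aligned fragments, not taken from any SEQ ID NO.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```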

Figures 28A-28B depict a local alignment between the nucleotide sequence of 
human MANGO 003 (SEQ ID NO:4) and the nucleotide sequence of mouse MANGO 003
(SEQ ID NO:7). The upper sequence is the human MANGO 003 nucleotide sequence,
while the lower sequence is the mouse MANGO 003 nucleotide sequence. These
nucleotide sequences share 62.8% identity over nucleotide 970 to nucleotide 2080 of the
human MANGO 003 sequence (nucleotide 10 to nucleotide 1070 of mouse MANGO 003).
The local alignment was performed using the L-ALIGN program version 2.0u54 July 1996
(Matrix file used: pam 120.mat, gap penalties of -12/-4 with a score of 3241; Huang and
Miller, 1991, Adv. Appl. Math. 12:373-381).

Figure 29 depicts a global alignment between the amino acid sequence of human 
MANGO 003 (SEQ ID NO:5) and the amino acid sequence of mouse MANGO 003 (SEQ
ID NO:8). The upper sequence is the human MANGO 003 amino acid sequence, while the
lower sequence is the mouse MANGO 003 amino acid sequence. These amino acid
sequences share 30.1% identity. The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global
alignment score of -488; Myers and Miller, 1989, CABIOS 4:11-7).
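
The following is a minimal sketch of global alignment scoring with affine gap penalties, intended only to show how a gap penalty pair such as -12/-4 enters a score. It is not the ALIGN program: it uses a plain quadratic-space dynamic program rather than the linear-space method of Myers and Miller, it substitutes a simple match/mismatch score for the pam 120 matrix, and it assumes that a gap of length k costs gap_open + (k - 1) * gap_extend. All names and default values are assumptions.

```python
NEG_INF = float("-inf")


def global_affine_score(a, b, match=1, mismatch=-1,
                        gap_open=-12, gap_extend=-4):
    """Best global alignment score under affine gap penalties.

    M[i][j]: best score with a[i-1] aligned to b[j-1]
    X[i][j]: best score with a[i-1] aligned to a gap
    Y[i][j]: best score with b[j-1] aligned to a gap
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):          # leading gap in b
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):          # leading gap in a
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


print(global_affine_score("ACGT", "ACT"))
```

Because a global alignment must span both sequences end to end, the total can be negative whenever gap and mismatch penalties outweigh the matched positions.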

Figures 30A-30E depict a global alignment between the nucleotide sequence of the 
open reading frame (ORF) of human TANGO 272 (SEQ ID NO:15) and the nucleotide
sequence of the open reading frame of mouse TANGO 272 (SEQ ID NO:18). The upper
sequence is the mouse TANGO 272 ORF nucleotide sequence, while the lower sequence is
the human TANGO 272 ORF nucleotide sequence. These nucleotide sequences share
39.1% identity. The global alignment was performed using the ALIGN program version
2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global alignment score
of -79; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 31A-31D depict a local alignment between the nucleotide sequence of
human TANGO 272 (SEQ ID NO:13) and the nucleotide sequence of mouse TANGO 272
(SEQ ID NO:16). The upper sequence is the human TANGO 272 nucleotide sequence,
while the lower sequence is the mouse TANGO 272 nucleotide sequence. These
nucleotide sequences share 67.6% identity over nucleotide 1890 to nucleotide 4610 of
the human TANGO 272 sequence (nucleotide 10 to nucleotide 2560 of mouse TANGO
272). The local alignment was performed using the L-ALIGN program version 2.0u54 July
1996 (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a score of 8462; Huang
and Miller, 1991, Adv. Appl. Math. 12:373-381).

Figures 32A-32B depict a global alignment between the amino acid sequence of 
human TANGO 272 (SEQ ID NO:14) and the amino acid sequence of mouse TANGO 272
(SEQ ID NO:17). The upper sequence is the human TANGO 272 amino acid sequence,
while the lower sequence is the mouse TANGO 272 amino acid sequence. These amino
acid sequences share 38.2% identity. The global alignment was performed using the
ALIGN program version 2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a
global alignment score of -19; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 33A-33D depict the cDNA sequence of rat TANGO 272 (SEQ ID NO:19)
and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:20). The open reading
frame of SEQ ID NO:19 extends from nucleotide 925 to nucleotide 2832 of SEQ ID NO:19
(SEQ ID NO:21).

Figures 34A-34H depict a global alignment between the nucleotide sequence of
human TANGO 272 (SEQ ID NO:13) and the nucleotide sequence of rat TANGO 272
(SEQ ID NO:19). The upper sequence is the human TANGO 272 nucleotide sequence,
while the lower sequence is the rat TANGO 272 nucleotide sequence. These nucleotide
sequences share 55.7% identity. The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global
alignment score of 8635; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 35A-35F depict a global alignment between the nucleotide sequence of 
mouse TANGO 272 (SEQ ID NO:16) and the nucleotide sequence of rat TANGO 272
(SEQ ID NO:19). The upper sequence is the mouse TANGO 272 nucleotide sequence,
while the lower sequence is the rat TANGO 272 nucleotide sequence. These nucleotide
sequences share 43.7% identity. The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global
alignment score of 2827; Myers and Miller, 1989, CABIOS 4:11-7).

Figure 36 depicts a global alignment of the human TANGO 295 and GenPept
AF037081 amino acid sequences. The upper sequence is the human TANGO 295 sequence
(SEQ ID NO:23), while the lower sequence is the GenPept AF037081 sequence (SEQ ID
NO:100). GenPept AF037081 encodes a ribonuclease k6 protein. The global alignment
revealed a 53.2% identity between these two sequences (Matrix file used: pam 120.mat, gap 
penalties of -12/-4 with a global alignment score of 405; Myers and Miller, 1989, CABIOS 
4:11-7). 

Figures 37A-37C depict a global alignment of the human TANGO 295 (SEQ ID 
NO:22) and GenPept AF037081 (SEQ ID NO:100) nucleotide sequences. The upper
sequence is the human TANGO 295 sequence, while the lower sequence is the GenPept
AF037081 sequence. The global alignment revealed a 22.6% identity between these two
sequences (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global alignment
score of -2718; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 38A-38B depict a local alignment of the human TANGO 295 (SEQ ID 
NO:22) and GenPept AF037081 (SEQ ID NO:100) nucleotide sequences. The upper 
sequence is the human TANGO 295 sequence, while the lower sequence is the GenPept 
AF037081 sequence. The local alignment revealed a 62.7% identity between nucleotide
235 to nucleotide 687 of human TANGO 295, and nucleotide 3 to nucleotide 453 of
AF037081; 43.4% identity between nucleotide 410 to nucleotide 850 of human TANGO
295, and nucleotide 3 to nucleotide 450 of AF037081; and 46.5% identity between
nucleotide 432 to nucleotide 700 of human TANGO 295, and nucleotide 5 to nucleotide 251
of AF037081 (Matrix file used: pam 120.mat, gap penalties of -12/-4 with a global
alignment score of 1214; Huang and Miller, 1991, Adv. Appl. Math. 12:373-381).

Figures 39A-39B depict an alignment of each of the EGF-like domains and laminin-
EGF-like domains of mouse TANGO 272 with consensus hidden Markov model EGF-like
domains. For alignments of the EGF-like domains, the upper sequence is the consensus
amino acid sequence (SEQ ID NO:46), while the lower sequence corresponds to amino
acids 37-67 of SEQ ID NO:17 (SEQ ID NO:64); amino acid 80 to amino acid 110 of SEQ
ID NO:17 (SEQ ID NO:65); amino acid 123 to amino acid 153 of SEQ ID NO:17 (SEQ ID
NO:66); and amino acid 166 to amino acid 196 of SEQ ID NO:17 (SEQ ID NO:67). For
alignments of the laminin/EGF-like domains, the upper sequence is the consensus hidden
Markov model domain (SEQ ID NO:48), while the lower sequence corresponds to amino
acid 3 to amino acid 37 of SEQ ID NO:17 (SEQ ID NO:68); amino acid 41 to amino acid
80 of SEQ ID NO:17 (SEQ ID NO:69); amino acid 83 to amino acid 123 of SEQ ID NO:17
(SEQ ID NO:70); and amino acid 127 to amino acid 172 of SEQ ID NO:17 (SEQ ID
NO:71). For alignment of the delta serrate ligand domain, the upper sequence is the
consensus hidden Markov model domain (SEQ ID NO:47), while the lower sequence
corresponds to amino acid 10 to amino acid 67 of SEQ ID NO:17 (SEQ ID NO:72).

Figure 40 depicts a hydropathy plot of rat TANGO 272. Relatively hydrophobic 
residues are above the dashed horizontal line, and relatively hydrophilic residues are below
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of rat TANGO 272 
are indicated. 

Figures 41A-41D depict an alignment of each of the EGF-like domains and laminin- 
EGF-like domains of rat TANGO 272 with consensus hidden Markov model of EGF-like 
domains. For alignments of the EGF-like domains, the upper sequence is the consensus 
amino acid sequence (SEQ ID NO:46), while the lower sequence corresponds to amino acid 
18 to amino acid 48 of SEQ ID NO:20 (SEQ ID NO:73); amino acid 61 to amino acid 91 of 
SEQ ID NO:20 (SEQ ID NO:74); amino acids 105-137 of SEQ ID NO:20 (SEQ ID 
NO:75); amino acids 150-180 of SEQ ID NO:20 (SEQ ID NO:76); amino acids 193-223 of 
SEQ ID NO:20 (SEQ ID NO:77); amino acids 236-266 of SEQ ID NO:20 (SEQ ID 
NO:78); amino acids 279-309 of SEQ ID NO:20 (SEQ ID NO:79); amino acids 322-352 of 
SEQ ID NO:20 (SEQ ID NO:80); amino acids 365-394 of SEQ ID NO:20 (SEQ ID 
NO:81); amino acids 407-437 of SEQ ID NO:20 (SEQ ID NO:82); and amino acids 450-
480 of SEQ ID NO:20 (SEQ ID NO:83). For alignments of the laminin/EGF-like domains,
the upper sequence is the consensus hidden Markov model domain (SEQ ID NO:48), while 
the lower sequence corresponds to amino acids 22-61 of SEQ ID NO:20 (SEQ ID NO:84); 
amino acids 65-105 of SEQ ID NO:20 (SEQ ID NO:85); amino acids 109-150 of SEQ ID 
NO:20 (SEQ ID NO:86); amino acids 154-193 of SEQ ID NO:20 (SEQ ID NO:87); amino 
acids 197-236 of SEQ ID NO:20 (SEQ ID NO:88); amino acids 240-279 of SEQ ID NO:20 
(SEQ ID NO:89); amino acids 283-322 of SEQ ID NO:20 (SEQ ID NO:90); amino acids 
326-365 of SEQ ID NO:20 (SEQ ID NO:91); amino acids 368-407 of SEQ ID NO:20 (SEQ
ID NO:92); amino acids 411-450 of SEQ ID NO:20 (SEQ ID NO:93); and amino acids 454-
489 of SEQ ID NO:20 (SEQ ID NO:94). For alignment of the delta serrate ligand domain,
the upper sequence is the consensus hidden Markov model domain (SEQ ID NO:47), while 
the lower sequence corresponds to amino acids 246-309 of SEQ ID NO:20 (SEQ ID 
NO:95). 

Detailed Description of the Invention 

The present invention is based, at least in part, on the discovery of cDNA molecules 
encoding INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, and TANGO 378, all of which are either wholly secreted or transmembrane 
proteins. 

The proteins and nucleic acid molecules of the present invention comprise a family 
of molecules having certain conserved structural and functional features. As used herein,
the term "family" is intended to mean two or more proteins or nucleic acid molecules
having a common structural domain and having sufficient amino acid or nucleotide
sequence identity as defined herein. Family members can be from either the same or 
different species. For example, a family can comprise two or more proteins of human 
origin, or can comprise one or more proteins of human origin and one or more of non- 
human origin. Members of the same family may also have common structural domains. 

For example, INTERCEPT 340 family members can include at least one, preferably 
two, and more preferably three fibrillar collagen C-terminal domains (also referred to herein
as "COLF domains"). As used herein, a "fibrillar collagen C-terminal domain" refers to an 
amino acid sequence of about 15 to 65, preferably about 20-60, more preferably about 25, 
31-58 amino acids in length. Consensus hidden Markov model COLF domains contain the 
sequence of SEQ ID NOs:31, 32, and 33 (Figure 3). The more conserved residues in the
consensus sequence are indicated by uppercase letters and the less conserved residues in the 
consensus sequence are indicated by lowercase letters. A comparison of the C-terminal 
sequences of fibrillar collagens, collagens X, VIII, and the collagen C1q revealed a
conserved cluster of amino acid residues having aromatic side chains (e.g., tyrosine,
phenylalanine, tryptophan, histidine) that exhibited marked similarities in hydrophilicity 
profiles between the different collagens, despite a low level of sequence similarity. These 
similarities in hydrophilicity profiles within their C-termini suggest that these proteins may 
adopt a common tertiary structure and that the conserved cluster of aromatic residues in this 
domain may be involved in C-terminal trimerization. The COLF domains of INTERCEPT
340 extend from about amino acids 58 to 116, 126 to 151, and 186 to 217 of SEQ ID NO:2 
(SEQ ID NOs:34, 35, and 36, respectively) (Figure 3). By alignment of the amino acid 
sequence of the consensus hidden Markov model COLF amino acid sequence with the 
amino acid sequence of the COLF domains of INTERCEPT 340, conserved amino acid 
residues having aromatic side chains can be found. For example, conserved tyrosine,
tryptophan and phenylalanine residues can be found at amino acid 87, 88 and 133 of SEQ
ID NO:2.

MANGO 003 and TANGO 354 family members can include at least one, preferably
two, and more preferably three immunoglobulin domains. As used herein, an
"immunoglobulin domain" (also referred to herein as "Ig") refers to an amino acid sequence
of about 45 to 85, preferably about 55-80, more preferably about 57, 58, or 78, 79 amino
acids in length. Preferably, the immunoglobulin domains have a bit score for the alignment
of the sequence to the Ig family Hidden Markov Model (HMM) of at least 10, preferably 
20-30, more preferably 22-40, more preferably 40-50, 50-75, 75-100, 100-200 or greater. 
The Ig family HMM has been assigned the PFAM Accession PF00047. Consensus hidden 
Markov model immunoglobulin domains are shown in Figures 6 and 23 (SEQ ID NO:37).
The more conserved residues in the consensus sequence are indicated by uppercase letters
and the less conserved residues in the consensus sequence are indicated by lowercase
letters. Immunoglobulin domains are present in a variety of proteins (including secreted 
and membrane-associated proteins). Membrane-associated proteins may be involved in 
protein-protein and protein-ligand interactions at the cell surface, and thus may influence
diverse activities including cell surface recognition and/or signal transduction. The
immunoglobulin domains of MANGO 003 extend from about amino acids 44 to 101, 165 to
223, and 261 to 240 of SEQ ID NO:5 (SEQ ID NOs:38, 39, and 40, respectively) (Figure
6). The immunoglobulin domain of TANGO 354 extends from about amino acids 33 to 110
of SEQ ID NO:26 (SEQ ID NO:41) (Figure 23).

MANGO 003 family members can include a neurotransmitter-gated ion channel
domain. As used herein, a "neurotransmitter-gated ion channel domain" refers to an amino
acid sequence of about 5 to 20, preferably about 7 to 12, more preferably about 9 to 10
amino acids in length. The neurotransmitter-gated ion channel domain HMM has been
assigned the PFAM Accession PF00065. A consensus hidden Markov model
neurotransmitter-gated ion channel domain contains the sequence of SEQ ID NO:42 shown
in Figure 7. The more conserved residues in the consensus sequence are indicated by
uppercase letters and the less conserved residues in the consensus sequence are indicated by
lowercase letters. The neurotransmitter-gated ion channel domain of MANGO 003 extends
from about amino acids 388 to 397 of SEQ ID NO:5 (SEQ ID NO:43).

TANGO 272 family members can include at least one, two, three, four, five, six, 
seven, eight, nine, ten, eleven, twelve, preferably thirteen, and more preferably fourteen 
EGF-like domains. Preferably, the EGF-like domains are found in the extracellular domain 
of a TANGO 272 protein. As used herein, an "EGF-like domain" refers to an amino acid 
sequence of about 25 to 50, preferably about 30 to 45, and more preferably 30 to 40 amino 
acid residues in length. An EGF domain further contains at least about 2 to 10, preferably, 
3 to 9, 4 to 8, or 6 to 7 conserved cysteine residues. A consensus hidden Markov model 
EGF-like domain sequence includes six cysteines, all of which are thought to be involved in 
disulfide bonds having the following amino acid sequence: Cys-Xaa(5, 7)-Cys-Xaa(4, 5, 
12)-Cys-Xaa(1, 5, 6)-Cys-Xaa(1)-Cys-Xaa(1)-Cys-Xaa(8)-Cys (SEQ ID NO:46), where
Xaa is any amino acid. The region between the fifth and the sixth cysteine typically 
contains two conserved glycines of which at least one is present in most EGF-like domains. 

In one embodiment, TANGO 272 includes at least one EGF-like domain having the 
sequences selected from the group consisting of: amino acids 151-181 of SEQ ID NO:14
(SEQ ID NO:49); amino acids 200-229 of SEQ ID NO:14 (SEQ ID NO:50); amino acids
242-272 of SEQ ID NO:14 (SEQ ID NO:51); amino acids 285-315 of SEQ ID NO:14 (SEQ
ID NO:52); amino acids 328-358 of SEQ ID NO:14 (SEQ ID NO:53); amino acids 378-404
of SEQ ID NO:14 (SEQ ID NO:54); amino acids 417-447 of SEQ ID NO:14 (SEQ ID
NO:55); amino acids 460-490 of SEQ ID NO:14 (SEQ ID NO:56); amino acids 503-533 of
SEQ ID NO:14 (SEQ ID NO:57); amino acids 546-576 of SEQ ID NO:14 (SEQ ID
NO:58); amino acids 589-619 of SEQ ID NO:14 (SEQ ID NO:59); amino acids 632-661 of
SEQ ID NO:14 (SEQ ID NO:60); amino acids 674-704 of SEQ ID NO:14 (SEQ ID
NO:61); and amino acids 717-747 of SEQ ID NO:14 (SEQ ID NO:62).

In another embodiment, TANGO 272 includes at least one EGF-like domain having 
the sequences selected from the group consisting of: amino acids 37-67 of SEQ ID NO:17 (SEQ ID
NO:64); amino acids 80-110 of SEQ ID NO:17 (SEQ ID NO:65); amino acids 123-153 of
SEQ ID NO:17 (SEQ ID NO:66); and amino acids 166-196 of SEQ ID NO:17 (SEQ ID
NO:67).

In yet another embodiment, TANGO 272 includes at least one EGF-like domain
having the sequences selected from the group consisting of: amino acids 18-48 of SEQ ID
NO:20 (SEQ ID NO:73); amino acids 61-91 of SEQ ID NO:20 (SEQ ID NO:74); amino
acids 105-137 of SEQ ID NO:20 (SEQ ID NO:75); amino acids 150-180 of SEQ ID NO:20
(SEQ ID NO:76); amino acids 193-223 of SEQ ID NO:20 (SEQ ID NO:77); amino acids
236-266 of SEQ ID NO:20 (SEQ ID NO:78); amino acids 279-309 of SEQ ID NO:20 (SEQ
ID NO:79); amino acids 322-352 of SEQ ID NO:20 (SEQ ID NO:80); amino acids 365-394
of SEQ ID NO:20 (SEQ ID NO:81); amino acids 407-437 of SEQ ID NO:20 (SEQ ID
NO:82); and amino acids 450-480 of SEQ ID NO:20 (SEQ ID NO:83).

An alignment of the consensus hidden Markov model EGF-like domains with the 
EGF-like domains of human TANGO 272 is shown in Figures 15A-15C. The more 
conserved residues in the consensus sequence are indicated by uppercase letters and the less 
conserved residues in the consensus sequence are indicated by lowercase letters. By 
alignment of the amino acid sequence of the consensus hidden Markov model EGF-like 
domain with the amino acid sequence of the EGF-like domains of TANGO 272, conserved 
cysteine residues can be found. For example, conserved cysteine residues can be found at 
amino acid 151, 159, 164, 167, 200, 206, 211, 218, 220, 229, 242, 249, 263, 264, 272, 285, 
291, 297, 304, 306, 315, 328, 334, 340, 347, 349, 358, 378, 386, 393, 395, 404, 417, 423, 
429, 436, 438, 447, 460, 466, 472, 479, 481, 490, 503, 509, 515, 522, 524, 533, 546, 552, 
558, 565, 567, 576, 589, 595, 601, 608, 610, 619, 632, 637, 643, 650, 652, 661, 674, 680, 
686, 693, 695, 717, 723, 729, 736, 738 and 747 of SEQ ID NO:14.
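
As an illustration of how the listed cysteine positions relate to a consensus spacing pattern, the sketch below counts the residues lying between consecutive cysteines within one domain. The helper names are hypothetical. The positions used are those listed above that fall within amino acids 200 to 229 and 285 to 315 of SEQ ID NO:14.

```python
def cysteine_spacings(positions):
    """Number of residues lying between consecutive cysteine positions."""
    ordered = sorted(positions)
    return [b - a - 1 for a, b in zip(ordered, ordered[1:])]


def group_by_domain(positions, domains):
    """Map each (start, end) domain to the positions it contains."""
    return {dom: [p for p in positions if dom[0] <= p <= dom[1]]
            for dom in domains}


# Conserved cysteine positions listed in the text for two EGF-like domains.
listed = [200, 206, 211, 218, 220, 229, 285, 291, 297, 304, 306, 315]
grouped = group_by_domain(listed, [(200, 229), (285, 315)])
for domain, cysteines in grouped.items():
    print(domain, cysteine_spacings(cysteines))
```

The resulting counts can be compared by eye against the Xaa counts given in the consensus pattern for the EGF-like domain.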

TANGO 272 family members can include at least one delta serrate ligand domain. 
As used herein, a "delta serrate ligand domain" (also referred to herein as a "DSL domain")
refers to an amino acid sequence of about 30-70, more preferably 45-60, and most
preferably 58 amino acids in length typically found in transmembrane signaling molecules
that regulate differentiation in metazoans (Lissemore et al., 1999, Mol. Phylogenet. Evol.
11(2):308-19). In one embodiment, human TANGO 272 includes a delta serrate ligand
domain from about amino acids 518 to 576 of SEQ ID NO:14 (SEQ ID NO:63); and about
amino acids 246 to 309 of SEQ ID NO:20 (SEQ ID NO:95). Figure 15B depicts an
alignment of the consensus hidden Markov model delta serrate ligand domain (SEQ ID
NO:47) with this domain in human TANGO 272 at amino acids 518 to 576 of SEQ ID
NO:14 (SEQ ID NO:63). Figures 39A-39B depict an alignment of the consensus hidden
Markov model delta serrate ligand domain (SEQ ID NO:47) with this domain in mouse
TANGO 272 at amino acids 10 to 67 of SEQ ID NO:17 (SEQ ID NO:72). Figures 41A-
41B depict an alignment of the consensus hidden Markov model delta serrate ligand domain
(SEQ ID NO:47) with this domain in rat TANGO 272 at amino acids 246 to 309 of SEQ ID
NO:20 (SEQ ID NO:95).

TANGO 272 family members can include at least one RGD cell attachment site. As 
used herein, the term "RGD cell attachment site" refers to a cell adhesion sequence
consisting of amino acids Arg-Gly-Asp typically found in extracellular matrix proteins such
as collagens, laminin and fibronectin, among others (reviewed in Ruoslahti, 1996, Annu.
Rev. Cell Dev. Biol. 12:697-715). Preferably, the RGD cell attachment site is located in the
extracellular domain of a TANGO 272 protein and interacts with (e.g., binds to) a cell surface
receptor, such as an integrin receptor. As used herein, the term "integrin" refers to a family
of receptors comprising α/β heterodimers that mediate cell attachment to extracellular
matrices and cell-cell adhesion events. The α subunits vary in size between 120 and 180
kDa and are each noncovalently associated with a β subunit (90-110 kDa) (reviewed by
Hynes, 1992, Cell 69:11-25). Most integrins are expressed in a wide variety of cells, and
most cells express several integrins. There are at least 8 known α subunits and 14 known β
subunits. The majority of the integrin ligands are extracellular matrix proteins involved in
substratum cell adhesion such as collagens, laminin, and fibronectin, among others. The RGD
cell attachment site is located at about amino acid residues 177-179 of SEQ ID NO:14.
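
A short sketch of how an Arg-Gly-Asp tripeptide can be located in a one-letter amino acid sequence, reporting 1-based inclusive positions in the same style as the residue ranges above, is given below. The function name is hypothetical and the example sequence is made up; it is not SEQ ID NO:14.

```python
def find_motif(seq, motif="RGD"):
    """Return (start, end) 1-based inclusive positions of every motif hit."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = seq.find(motif, start + 1)  # allow overlapping hits
    return hits


print(find_motif("MARGDKRGD"))
```

A hit reported as (177, 179) by such a search would correspond to the residue range quoted above for the RGD cell attachment site.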

MANGO 347 family members can include a CUB domain sequence. As used 
herein, the term "CUB domain" includes an amino acid sequence having at least about 80-
150, preferably 90-130, more preferably 96-120, and most preferably about 110 amino acids
in length. Preferably, a CUB domain further includes at least one, preferably two, three,
and most preferably four conserved cysteine residues. Preferably, the conserved cysteine
residues form at least one, and preferably two disulfide bridges (e.g., Cys1-Cys2, and Cys3-
Cys4) resulting in a β-barrel configuration. The CUB domain of MANGO 347 extends
from about amino acid 40 to amino acid 136 of SEQ ID NO:11 (SEQ ID NO:45). Figure 12
depicts an alignment of the consensus hidden Markov model CUB domain (SEQ ID NO:44)
with this domain in human MANGO 347 at amino acids 40 to 136 of SEQ ID NO:11 (SEQ
ID NO:45).

TANGO 295 family members can include a pancreatic ribonuclease domain
sequence. As used herein, the term "pancreatic ribonuclease domain" includes an amino
acid sequence having at least about 100 to 150, preferably 110-140, more preferably 120-
130, and most preferably 124 amino acids in length. Preferably, a pancreatic ribonuclease
domain further includes at least one, preferably two, three, four and most preferably five 
conserved cysteine residues and an amino acid residue, e.g., a lysine, which is involved in
catalytic activity. Preferably, at least one cysteine residue is involved in a disulfide bond, a
lysine residue is involved in catalytic activity, and three other residues are involved in substrate
binding. Proteins having the pancreatic ribonuclease domain are pyrimidine-specific 
endonucleases present in high quantities in the pancreas of a number of mammalian taxa 
and of a few reptiles. The pancreatic ribonuclease domain of TANGO 295 extends from 
about amino acid 32 to amino acid 156 of SEQ ID NO:23 (SEQ ID NO:97). Figure 20 
depicts an alignment of the consensus hidden Markov model pancreatic ribonuclease
domain (SEQ ID NO:96) with this domain in human TANGO 295 at amino acids 32 to 156
of SEQ ID NO:23 (SEQ ID NO:97). 

Based on structural similarities, TANGO 378 family members can be classified as 
members of the superfamily of G protein-coupled receptors. As used herein, the term "G
protein-coupled receptor" or "GPCR" refers to a family of proteins that preferably comprise
an N-terminal extracellular domain, seven transmembrane domains (also referred to as
membrane-spanning domains), three extracellular domains (also referred to as extracellular
loops), three cytoplasmic domains (also referred to as cytoplasmic loops), and a C-terminal
cytoplasmic domain (also referred to as a cytoplasmic tail). Members of the GPCR family
also share certain conserved amino acid residues, some of which have been determined to 
be critical to receptor function and/or G protein signaling. An alignment of the 
transmembrane domains of 44 representative GPCRs can be found at 
http://mgdkkLnidll.nih.gov:8000/extended.html. 

Accordingly, in one embodiment, TANGO 378 family members can include at least
one, two, three, four, five, six, or preferably, seven transmembrane domains, and thus have a
"7 transmembrane receptor profile". As used herein, the term "7 transmembrane receptor
profile" includes an amino acid sequence having at least about 10-300, preferably about 15-
200, more preferably about 20-100 amino acid residues, or at least about 22-100 amino
acids in length and having a bit score for the alignment of the sequence to the 7tm_1 family
Hidden Markov Model (HMM) of at least 10, preferably 20-30, more preferably 22-40,
more preferably 40-50, 50-75, 75-100, 100-200 or greater. The 7tm_1 family HMM has
been assigned the PFAM Accession PF00001
(http://genome.wustl.edu/PfamANWWdata/7tm_l.html). In one embodiment, the seven
transmembrane domains of TANGO 378 extend from about amino acids 245 to about
amino acid 269 of SEQ ID NO:29 (SEQ ID NO:135), about amino acids 287 to about
amino acid 306 of SEQ ID NO:29 (SEQ ID NO:136), about amino acids 323 to about
amino acid 343 of SEQ ID NO:29 (SEQ ID NO:137), about amino acids 358 to about
amino acid 376 of SEQ ID NO:29 (SEQ ID NO:138), about amino acids 414 to about
amino acid 438 of SEQ ID NO:29 (SEQ ID NO:139), about amino acids 457 to about
amino acid 477 of SEQ ID NO:29 (SEQ ID NO:140), and about amino acids 485 to about
amino acid 504 of SEQ ID NO:29 (SEQ ID NO:141); and a C-terminal cytoplasmic domain
which extends from about amino acid 505 to amino acid 528 of SEQ ID NO:29 (SEQ ID
NO:142). Figure 26 depicts an alignment of each of the transmembrane domains of
TANGO 378 with the consensus hidden Markov model seven transmembrane receptor
domain (SEQ ID NO:98).

To identify the presence of a 7 transmembrane receptor profile in a TANGO 378, the
amino acid sequence of the protein is searched against a database of HMMs (e.g., the Pfam
database, release 2.1) using the default parameters
(http://www.sanger.ac.uk/Software/PfamyHMM_search). For example, the hmmsf program,
which is available as part of the HMMER package of search programs, is a family specific
default program for PF00001 and a score of 15 is the default threshold score for determining
a hit. Alternatively, the seven transmembrane domain can be predicted based on stretches
of hydrophobic amino acids forming α-helices (SOSUI server). Accordingly, proteins
having at least 50-60% identity, preferably about 60-70%, more preferably about 70-80%,
or about 80-90% identity with the 7 transmembrane receptor profile of human TANGO 378
are within the scope of the invention.
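
The hydrophobic-stretch approach mentioned above can be illustrated with a sliding-window hydropathy scan. The following is a minimal sketch only: it assumes the widely used Kyte-Doolittle hydropathy scale with a 19-residue window and a mean-score cutoff of 1.6, none of which is specified in this document, and the test sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the text names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(sequence, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_stretches(sequence, window=19, threshold=1.6):
    """Merge windows whose mean exceeds the threshold into 1-based ranges."""
    stretches = []
    for i, mean in enumerate(window_means(sequence, window)):
        if mean <= threshold:
            continue
        start, end = i + 1, i + window
        if stretches and start <= stretches[-1][1] + 1:
            # Overlaps or abuts the previous stretch, so extend it.
            stretches[-1] = (stretches[-1][0], end)
        else:
            stretches.append((start, end))
    return stretches


# Hypothetical sequence: polar flanks around a 22-residue hydrophobic core.
toy = "DDKKEEDDKK" + "LLIVLLAVLLIVLLAVLLIVLL" + "KKDDEEKKDD"
print(hydrophobic_stretches(toy))
```

Because the window averages over 19 residues, a reported stretch typically extends a few residues beyond the hydrophobic core on each side, so the boundaries are approximate.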

TANGO 378 family members can include at least one, preferably two, and most
preferably three extracellular loops. As defined herein, the term "loop" includes an amino
acid sequence having a length of at least about 4, preferably about 5-10, preferably about
10-20, and more preferably about 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100,
or 100-150 amino acid residues, and has an amino acid sequence that connects two
transmembrane domains within a protein or polypeptide. Accordingly, the N-terminal
amino acid of a loop is adjacent to a C-terminal amino acid of a transmembrane domain in a
naturally-occurring TANGO 378 or TANGO 378-like molecule, and the C-terminal amino
acid of a loop is adjacent to an N-terminal amino acid of a transmembrane domain in a
naturally-occurring TANGO 378 or TANGO 378-like molecule. As used herein, an
"extracellular loop" includes an amino acid sequence located outside of a cell, or
extracellularly. For example, an extracellular loop can be found at about amino acids 307-
322, 377-413, and 478-484 of SEQ ID NO:29.

TANGO 378 family members can include at least one, preferably two, and most 
preferably three cytoplasmic loops. As used herein, a "cytoplasmic loop" includes an amino
acid sequence located within a cell or within the cytoplasm of a cell. For example, a
cytoplasmic loop is found at about amino acids 270-286, 344-357, and 439-456 of SEQ ID
NO:29.
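
The loop coordinates given above follow directly from the seven transmembrane segments of SEQ ID NO:29: each loop is the gap between two consecutive segments, and the loops alternate sides of the membrane. The sketch below reproduces that bookkeeping; the function names are illustrative, and the assumption that the N-terminus lies outside the cell is taken from the N-terminal extracellular domain described in this document.

```python
# Transmembrane segments of SEQ ID NO:29 as stated in the text (1-based, inclusive).
TM_SEGMENTS = [(245, 269), (287, 306), (323, 343), (358, 376),
               (414, 438), (457, 477), (485, 504)]


def loops_between(segments):
    """Return the residue ranges that connect consecutive transmembrane segments."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(segments, segments[1:])]


def classify_loops(segments, n_terminus_outside=True):
    """Alternate loop sides, starting opposite to the side of the N-terminus."""
    if n_terminus_outside:
        sides = ("cytoplasmic", "extracellular")
    else:
        sides = ("extracellular", "cytoplasmic")
    topology = {"cytoplasmic": [], "extracellular": []}
    for index, loop in enumerate(loops_between(segments)):
        topology[sides[index % 2]].append(loop)
    return topology


topology = classify_loops(TM_SEGMENTS)
print(topology["extracellular"])  # [(307, 322), (377, 413), (478, 484)]
print(topology["cytoplasmic"])    # [(270, 286), (344, 357), (439, 456)]
```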

In one embodiment, a MANGO 003, a TANGO 272, a TANGO 354 or a TANGO 
378 family member can include one or more of the following domains: (1) an N-terminal 
extracellular domain, (2) a transmembrane domain, or (3) a C-terminal cytoplasmic domain. 

A MANGO 003, a TANGO 272, a TANGO 354 or a TANGO 378 family member can
include an extracellular domain. When located at the N-terminal domain the extracellular
domain is referred to herein as an "N-terminal extracellular domain" or an "extracellular
domain". As used herein, an "N-terminal extracellular domain" includes an amino acid
sequence having about 1-800, preferably about 1-746, more preferably about 1-650, more
preferably about 1-550, more preferably about 1-369, or about 150 amino acid residues in
length and is located outside of a cell or extracellularly. The C-terminal amino acid residue
of an "N-terminal extracellular domain" is adjacent to an N-terminal amino acid residue of a
transmembrane domain in a naturally-occurring MANGO 003, TANGO 272, TANGO 354
or TANGO 378 protein. Preferably, the N-terminal extracellular domain is capable of
interacting with (e.g., binding to) an extracellular signal, for example, a ligand (e.g., a
glycoprotein hormone) or a cell surface receptor (e.g., an integrin receptor). Most
preferably, the N-terminal extracellular domain mediates a variety of biological processes,
for example, protein-protein interactions, signal transduction and/or cell adhesion. In one
embodiment, an N-terminal extracellular domain is located at about amino acids 25-374 of
SEQ ID NO:5 (SEQ ID NO:103); about amino acids 1-73 of SEQ ID NO:8 (SEQ ID
NO:107); at about amino acids 21-767 of SEQ ID NO:14 (SEQ ID NO:114); at about amino
acids 1-216 of SEQ ID NO:17 (SEQ ID NO:118); at about amino acids 1-500 of SEQ ID
NO:20 (SEQ ID NO:122); at about amino acids 20-169 of SEQ ID NO:26 (SEQ ID
NO:129); and at about amino acids 22-244 of SEQ ID NO:29 (SEQ ID NO:134).

In another embodiment, a MANGO 003, a TANGO 272, a TANGO 354 or a 
TANGO 378 family member can include a transmembrane domain. As used herein, the 
term "transmembrane domain" includes an amino acid sequence of about 15 amino acid 
residues in length which spans the plasma membrane. More preferably, a transmembrane 
domain includes at least about 20, 25, 30, 35, 40, or 45 amino acid residues and spans the
plasma membrane. Transmembrane domains are rich in hydrophobic residues, and
typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%,
80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic,
e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are
described in, for example, http://pfam.wustl.edu/cgi-bin/getdesc?name-7tm-1 and Zagotta
et al., 1996, Annual Rev. Neurosci. 19:235-63, the contents of which are incorporated
herein by reference. Amino acid residues 375-398 of SEQ ID NO:5 (SEQ ID NO:104), 74-
96 of SEQ ID NO:8 (SEQ ID NO:108), 768-791 of SEQ ID NO:14 (SEQ ID NO:115), 217-
240 of SEQ ID NO:17 (SEQ ID NO:119), 501-524 of SEQ ID NO:20 (SEQ ID NO:123),
170-193 of SEQ ID NO:26 (SEQ ID NO:130), and 245-269, 287-306, 323-343, 358-376,
414-438, 457-477 and 485-504 of SEQ ID NO:29 (SEQ ID NOs:135-141) include
transmembrane domains.

A MANGO 003, TANGO 272, TANGO 354 or TANGO 378 family member can 
include a C-terminal cytoplasmic domain. As used herein, a "C-terminal cytoplasmic
domain" includes an amino acid sequence having a length of at least about 10, preferably
about 10-25, more preferably about 25-50, more preferably about 50-75, even more
preferably about 75-100, 100-133, 133-150, 150-200, 200-250, 250-300, 300-400, 400-500,
or 500-600 amino acid residues and is located within a cell or within the cytoplasm of a
cell. Accordingly, the N-terminal amino acid residue of a "C-terminal cytoplasmic domain"
is adjacent to a C-terminal amino acid residue of a transmembrane domain in a naturally-
occurring MANGO 003, TANGO 272, TANGO 354 or TANGO 378 protein. For example,
a C-terminal cytoplasmic domain is found at about amino acid residues 399-504 of SEQ ID
NO:5, 97-208 of SEQ ID NO:8, 792-1050 of SEQ ID NO:14, 241-497 of SEQ ID NO:17,
525-636 of SEQ ID NO:20, 194-305 of SEQ ID NO:26, and 505-528 of SEQ ID NO:29.

MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 
378 family members can include a signal peptide. As used herein, a "signal peptide" 
includes a peptide of at least about 15 amino acid residues in length which occurs at the N-
terminus of secretory and membrane-bound proteins and which contains at least about 70%
hydrophobic amino acid residues such as alanine, leucine, isoleucine, phenylalanine,
proline, tyrosine, tryptophan, or valine. The sequence can contain about 15 to 45 amino
acid residues or about 17-22 amino acid residues, and has at least about 60-80%, 65-75%, or
about 70% hydrophobic residues. A signal peptide serves to direct a protein containing
such a sequence to a lipid bilayer. Thus, in one embodiment, a MANGO 003 protein
contains a signal peptide of about amino acids 1-22, 1-23, 1-24, 1-25, or 1-26 of SEQ ID
NO:5 (SEQ ID NO:101). In one embodiment, a MANGO 347 protein contains a signal
peptide of about amino acids 1-33, 1-34, 1-35, 1-36, or 1-37 of SEQ ID NO:11 (SEQ ID
NO:110). In one embodiment, a TANGO 272 protein contains a signal peptide of amino
acids 1-18, 1-19, 1-20, 1-21, or 1-22 of SEQ ID NO:14 (SEQ ID NO:112). In yet another
embodiment, a TANGO 295 protein contains a signal peptide of amino acids 1-26, 1-27,
1-28, 1-29, or 1-30 of SEQ ID NO:23 (SEQ ID NO:125). In another embodiment, a TANGO
354 protein contains a signal peptide of amino acids 1-17, 1-18, 1-19, 1-20, or 1-21 of SEQ
ID NO:26 (SEQ ID NO:127). In another embodiment, a TANGO 378 protein contains a
signal peptide of amino acids 1-19, 1-20, 1-21, 1-22, or 1-23 of SEQ ID NO:29 (SEQ ID
NO:132). The signal peptide is cleaved during processing of the mature protein. The
amino acid sequence of the mature MANGO 003, MANGO 347, TANGO 272, TANGO
295, TANGO 354, or TANGO 378 protein starts at the next amino acid after the signal
peptide is cleaved. For example, the amino acid sequence of MANGO 003 may start at
amino acids 23, 24, 25, 26, or 27 depending on the exact location of the cleavage of the
signal peptide.
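
The two quantitative points in the signal peptide definition, the hydrophobic residue content and the position at which the mature protein begins, can be checked with a few lines of code. This is a sketch under stated assumptions: the hydrophobic set is limited to the residues listed in the definition above, the helper names are illustrative, and the 20-residue peptide is hypothetical rather than taken from any sequence of the invention.

```python
# Residues the text lists as hydrophobic for the signal peptide definition:
# alanine, leucine, isoleucine, phenylalanine, proline, tyrosine, tryptophan, valine.
HYDROPHOBIC = set("ALIFPYWV")


def hydrophobic_fraction(peptide):
    """Fraction of residues in the peptide that belong to HYDROPHOBIC."""
    return sum(residue in HYDROPHOBIC for residue in peptide) / len(peptide)


def mature_start(signal_peptide_end):
    """1-based position of the first residue after the cleaved signal peptide."""
    return signal_peptide_end + 1


# Signal peptide end points given in the text for MANGO 003 (SEQ ID NO:5).
print([mature_start(end) for end in (22, 23, 24, 25, 26)])  # [23, 24, 25, 26, 27]

# Hypothetical 20-residue peptide, not taken from any sequence in this document.
toy_signal = "MALLPLLLVLALAVPAWAGS"
print(hydrophobic_fraction(toy_signal) >= 0.70)  # True
```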

The signal peptide is cleaved during processing of the mature protein. Sometimes 
the initial methionine residue is also cleaved from the protein during signal peptide 
processing. Thus, in one embodiment, a MANGO 003 protein does not contain a signal 
peptide or an initial methionine residue and begins from residue 2 of SEQ ID NO:102. In
one embodiment, a MANGO 347 protein does not contain a signal peptide or an initial
methionine residue and begins from residue 2 of SEQ ID NO:111. In one embodiment, a
TANGO 272 protein does not contain a signal peptide or an initial methionine residue and
begins from residue 2 of SEQ ID NO:113. Thus, in one embodiment, a TANGO 295
protein does not contain a signal peptide or an initial methionine residue and begins from
residue 2 of SEQ ID NO:126. Thus, in one embodiment, a TANGO 354 protein does not
contain a signal peptide or an initial methionine residue and begins from residue 2 of SEQ ID
NO:128. Thus, in one embodiment, a TANGO 378 protein does not contain a signal
peptide or an initial methionine residue and begins from residue 2 of SEQ ID NO:133.

In one embodiment, a MANGO 003 family member includes three immunoglobulin 
domains and a neurotransmitter-gated ion channel domain. In another embodiment, a 
MANGO 003 family member includes three immunoglobulin domains, a neurotransmitter- 
gated ion channel domain and a transmembrane domain. In yet another embodiment, a 
MANGO 003 family member includes three immunoglobulin domains, a neurotransmitter- 
gated ion channel domain, a transmembrane domain and an N-terminal extracellular 
domain. In another embodiment, a MANGO 003 family member includes three 
immunoglobulin domains, a neurotransmitter-gated ion channel domain, a transmembrane 
domain, an N-terminal extracellular domain and a C-terminal cytoplasmic domain. In yet 
another embodiment, a MANGO 003 family member includes three immunoglobulin 
domains, a neurotransmitter-gated ion channel domain, a transmembrane domain, an N-
terminal extracellular domain, a C-terminal cytoplasmic domain, and a signal peptide.

In one embodiment, a TANGO 354 family member includes at least one
immunoglobulin domain and a transmembrane domain. In another embodiment, a
TANGO 354 family member includes at least one immunoglobulin domain, a
transmembrane domain and a signal peptide.

In one embodiment, a TANGO 272 family member includes fourteen EGF-like 
domains and a delta serrate ligand domain. In another embodiment, a TANGO 272 family
member includes fourteen EGF-like domains, a delta serrate ligand domain and an RGD
cell attachment site. In yet another embodiment, a TANGO 272 family member includes
fourteen EGF-like domains, a delta serrate ligand domain, an RGD cell attachment site, and 
a transmembrane domain. In another embodiment, a TANGO 272 family member includes 
fourteen EGF-like domains, a delta serrate ligand domain, an RGD cell attachment site, a 
transmembrane domain, and an extracellular N-terminal domain. In another embodiment, a 
TANGO 272 family member includes fourteen EGF-like domains, a delta serrate ligand 
domain, an RGD cell attachment site, a transmembrane domain, an extracellular N-terminal 
domain and a C-terminal cytoplasmic domain. In another embodiment, a TANGO 272 
family member includes fourteen EGF-like domains, a delta serrate ligand domain, an RGD 
cell attachment site, a transmembrane domain, an extracellular N-terminal domain, a C- 
terminal cytoplasmic domain, and a signal peptide. 

In one embodiment, a TANGO 378 family member includes a 7 transmembrane 
receptor profile and three extracellular loops. In another embodiment, a TANGO 378 
family member includes a 7 transmembrane receptor profile, three extracellular loops, and 
three cytoplasmic loops. In yet another embodiment, a TANGO 378 family member 
includes a 7 transmembrane receptor profile, three extracellular loops, three cytoplasmic 
loops, and an extracellular N-terminal domain. In another embodiment, a TANGO 378
family member includes a 7 transmembrane receptor profile, three extracellular loops, three
cytoplasmic loops, an extracellular N-terminal domain, and a C-terminal cytoplasmic
domain. In another embodiment, a TANGO 378 family member includes a 7
transmembrane receptor profile, three extracellular loops, three cytoplasmic loops, an
extracellular N-terminal domain, a C-terminal cytoplasmic domain, and a signal peptide. 

Various features of INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, 
TANGO 295, TANGO 354, and TANGO 378 are summarized below. 

INTERCEPT 340 

A cDNA encoding INTERCEPT 340 was identified by analyzing the sequences of 
clones present in a human fetal spleen cDNA library. 

This analysis led to the identification of a clone, jthsal02bl2, encoding full-length 
human INTERCEPT 340. The cDNA of this clone is 3284 nucleotides long (Figures 1A-
1B; SEQ ID NO:1). The 723 nucleotide open reading frame of this cDNA, nucleotides
1222-1944 of SEQ ID NO:1 (SEQ ID NO:3), encodes a 241 amino acid protein (Figures
1A-1B; SEQ ID NO:2).

Human INTERCEPT 340 that has not been post-translationally modified is 
predicted to have a molecular weight of 27.2 kDa.

Human INTERCEPT 340 includes three fibrillar collagen C-terminal (COLF)
domains at amino acids 58-116 of SEQ ID NO:2 (SEQ ID NO:34); amino acids 126-151 of
SEQ ID NO:2 (SEQ ID NO:35); and amino acids 186-217 of SEQ ID NO:2 (SEQ ID
NO:36). Figure 3 depicts alignments of each of the COLF domains of human INTERCEPT
340 with consensus hidden Markov model COLF domains (SEQ ID NOs:31, 32, and 33).
In one embodiment, INTERCEPT 340 is a secreted protein. In another embodiment, 
INTERCEPT 340 is a membrane-associated protein. 

An N-glycosylation site is present at amino acids 105-108 of SEQ ID NO:2. A
glycosaminoglycan attachment site is present at amino acids 161-164 of SEQ ID NO:2.
Protein kinase C phosphorylation sites are present at amino acids 57-59, 152-154, and 227-
229 of SEQ ID NO:2. A tyrosine kinase phosphorylation site is present at amino acids 81-
87 of SEQ ID NO:2. Casein kinase II phosphorylation sites are present at amino acids 36-
39, 120-123 and 181-184. N-myristylation sites are present at amino acids 109-114 and
164-169 of SEQ ID NO:2.
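
Site lists of this kind are commonly produced by scanning the protein sequence for short consensus patterns. The sketch below assumes PROSITE-style patterns for three of the site types, written as regular expressions; the text does not state which tool or patterns were actually used, and the scanned sequence is a hypothetical toy peptide rather than SEQ ID NO:2.

```python
import re

# Assumed PROSITE-style consensus patterns, written as regular expressions.
MOTIFS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "Protein kinase C phosphorylation": r"[ST].[RK]",
    "Casein kinase II phosphorylation": r"[ST].{2}[DE]",
}


def find_sites(sequence, pattern):
    """Return 1-based inclusive (start, end) ranges, including overlapping hits."""
    sites = []
    # The lookahead lets matches that overlap each other all be reported.
    for match in re.finditer(f"(?=({pattern}))", sequence):
        start = match.start() + 1
        end = match.start() + len(match.group(1))
        sites.append((start, end))
    return sites


# Hypothetical peptide used only to exercise the patterns.
toy = "MANGSAKTPRSLEEDNPTA"
for name, pattern in MOTIFS.items():
    print(name, find_sites(toy, pattern))
```

The reported ranges use the same convention as the text: a four-residue N-glycosylation site is given as a start and end position three residues apart.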

Clone jthsal02bl2, which encodes human INTERCEPT 340, was deposited as a 
composite deposit having a designation EpI340 with the American Type Culture Collection 
(ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 and 
assigned Accession Number PTA-250. A description of the deposit conditions is set forth 
in the section entitled "Deposit of Clones" below. This deposit will be maintained under 
the terms of the Budapest Treaty on the International Recognition of the Deposit of 
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a 
convenience for those of skill in the art and is not an admission that a deposit is required 
under 35 U.S.C. §112. 

Figure 2 depicts a hydropathy plot of human INTERCEPT 340. Relatively
hydrophobic regions are above the horizontal line, and relatively hydrophilic regions are 
below the horizontal line. The cysteine residues (cys) are indicated by short vertical lines 
just below the hydropathy trace. 

Use of INTERCEPT 340 Nucleic Acids, Polypeptides, and Modulators Thereof 

INTERCEPT 340 includes three fibrillar collagen C-terminal domains. Proteins 
having such domains play a role in modulating connective tissue formation and/or 
maintenance, and thus can influence a wide variety of biological processes, including 
assembly into fibrils; strengthening and organization of the extracellular matrix; shaping of 
tissues and cells; modulation of cell migration; and/or modulation of signal transduction 
pathways. Because INTERCEPT 340 includes fibrillar collagen C-terminal domains, 
INTERCEPT 340 polypeptides, nucleic acids, and modulators thereof can be used to treat 
connective tissue disorders, including a skin disorder and/or a skeletal disorder (e.g., Marfan
syndrome and osteogenesis imperfecta); cardiovascular disorders including
hyperproliferative vascular diseases (e.g., hypertension, vascular restenosis and
atherosclerosis), ischemia reperfusion injury, cardiac hypertrophy, coronary artery disease,
myocardial infarction, arrhythmia, cardiomyopathies, and congestive heart failure; and/or
hematopoietic disorders (e.g., myeloid disorders, lymphoid malignancies, T cell disorders).

As INTERCEPT 340 was originally found in a fetal spleen library, INTERCEPT 
340 nucleic acids, proteins, and modulators thereof can be used to modulate the function,
survival, morphology, migration, proliferation and/or differentiation of cells that form the
spleen, e.g., cells of the splenic connective tissue, e.g., splenic smooth muscle cells and/or
endothelial cells of the splenic blood vessels. INTERCEPT 340 nucleic acids, proteins, and
modulators thereof can also be used to modulate the proliferation, differentiation, and/or
function of cells that are processed, e.g., regenerated or phagocytized within the spleen, e.g.,
erythrocytes and/or B and T lymphocytes and macrophages. Thus INTERCEPT 340
nucleic acids, proteins, and modulators thereof can be used to treat spleen, e.g., the fetal
spleen, associated diseases and disorders. Examples of splenic diseases and disorders
include e.g., splenic lymphoma and/or splenomegaly, and/or phagocytotic disorders, e.g.,
those inhibiting macrophage engulfment of bacteria and viruses in the bloodstream.

Further, in light of INTERCEPT 340's presence in a human fetal spleen cDNA 
library, INTERCEPT 340 expression can be utilized as a marker for specific tissues (e.g., 
lymphoid tissues such as the spleen) and/or cells (e.g., splenic) in which INTERCEPT 340
is expressed. INTERCEPT 340 nucleic acids can also be utilized for chromosomal
mapping.

MANGO 003

A cDNA encoding human MANGO 003 was identified by analyzing the sequences
of clones present in a human thyroid cDNA library.

This analysis led to the identification of a clone, jthYa030d03, encoding full-length
human MANGO 003. The cDNA of this clone is 3169 nucleotides long (Figures 4A-4B;
SEQ ID NO:4). The 1512 nucleotide open reading frame of this cDNA, nucleotide 57 to
nucleotide 1568 of SEQ ID NO:4 (SEQ ID NO:6), encodes a 504 amino acid protein
(Figures 4A-4B; SEQ ID NO:5).
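
The open reading frame lengths and protein lengths quoted for INTERCEPT 340 and human MANGO 003 can be cross-checked from the stated nucleotide coordinates. The helper below is an illustrative sketch (its name is not from this document); it treats the coordinates as 1-based and inclusive and counts three nucleotides per amino acid.

```python
def orf_summary(first_nt, last_nt):
    """Length of an ORF given 1-based inclusive coordinates, and its codon count."""
    length = last_nt - first_nt + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


print(orf_summary(1222, 1944))  # INTERCEPT 340: (723, 241, 0)
print(orf_summary(57, 1568))    # human MANGO 003: (1512, 504, 0)
```

A remainder of zero indicates that the stated coordinates describe a whole number of codons, matching the 241 and 504 amino acid proteins given in the text.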

Human MANGO 003 that has not been post-translationally modified is predicted to 
have a molecular weight of 54.5 kDa prior to cleavage of its signal peptide (52.1 kDa after
cleavage of its signal peptide).

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human MANGO 003 includes a 24 amino acid signal
peptide at amino acid 1 to about amino acid 24 of SEQ ID NO:5 (SEQ ID NO:101)
preceding the mature human MANGO 003 protein which corresponds to about amino acid
25 to amino acid 504 of SEQ ID NO:5 (SEQ ID NO:102).

Human MANGO 003 is a transmembrane protein having an extracellular domain
which extends from about amino acid 25 to about amino acid 374 of SEQ ID NO:5 (SEQ
ID NO:103), a transmembrane domain which extends from about amino acid 375 to about
amino acid 398 of SEQ ID NO:5 (SEQ ID NO:104), and a cytoplasmic domain which
extends from about amino acid 399 to amino acid 504 of SEQ ID NO:5 (SEQ ID NO:105).

Alternatively, in another embodiment, a human MANGO 003 protein contains an 
extracellular domain which extends from about amino acid 399 to amino acid 504 of SEQ 
ID NO:5 (SEQ ID NO:105), a transmembrane domain which extends from about amino
acid 375 to about amino acid 398 of SEQ ID NO:5 (SEQ ID NO:104), and a cytoplasmic
domain which extends from about amino acid 25 to about amino acid 374 of SEQ ID NO:5
(SEQ ID NO:103).

Human MANGO 003 includes three immunoglobulin domains at amino acids 44-
101 of SEQ ID NO:5 (SEQ ID NO:38); amino acids 165-223 of SEQ ID NO:5 (SEQ ID
NO:39); and amino acids 261-340 of SEQ ID NO:5 (SEQ ID NO:40). Figure 6 depicts
alignments of each of the immunoglobulin domains of MANGO 003 with a consensus
hidden Markov model immunoglobulin domain (SEQ ID NO:37).

Human MANGO 003 includes a neurotransmitter gated ion channel domain at 
amino acids 388-397 of SEQ ID NO:5 (SEQ ID NO:43). Figure 7 depicts an alignment of 
the neurotransmitter gated ion channel domain of human MANGO 003 with a 
neurotransmitter gated ion channel domain derived from a hidden Markov model (SEQ ID 
NO:42). 

N-glycosylation sites are present at amino acids 111-114, 231-234, 255-258, and
293-296 of SEQ ID NO:5. A cAMP and cGMP-dependent protein kinase phosphorylation
site is present at amino acids 202-205 of SEQ ID NO:5. Protein kinase C phosphorylation
sites are present at amino acids 44-48, 167-169, 207-209, 216-218, 220-222, 224-226, 233-
235, 347-349, and 422-424 of SEQ ID NO:5. Casein kinase II phosphorylation sites are
present at amino acids 192-195, 256-259, 294-297, 313-316, 422-425, and 490-493 of SEQ
ID NO:5. Tyrosine kinase phosphorylation sites are present at amino acids 212-219 and
329-336 of SEQ ID NO:5. N-myristylation sites are present at amino acids 95-100, 228-
233, 261-266, 317-322, 334-339, 382-387, and 443-448 of SEQ ID NO:5.

Clone jthYa030d03, which encodes human MANGO 003, was deposited as a 
composite deposit having a designation EpthLa6al with the American Type Culture
Collection (ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on March 27,
1999 and assigned Accession Number 207178. This deposit will be maintained under the
terms of the Budapest Treaty on the International Recognition of the Deposit of
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a
convenience for those of skill in the art and is not an admission that a deposit is required
under 35 U.S.C. §112.

Figure 5 depicts a hydropathy plot of human MANGO 003. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 5 indicates the presence of a 
hydrophobic domain within human MANGO 003, suggesting that human MANGO 003 is a 
transmembrane protein. 

A cDNA encoding mouse MANGO 003 was identified by analyzing the sequences
of clones present in a mouse choroid plexus cDNA library. 

This analysis led to the identification of a clone, jfinjfD04cl 1, encoding partial 
mouse MANGO 003. The cDNA of this clone is 504 nucleotides long (Figures 8A-8B; 
SEQ ID NO:7). The 626 nucleotide open reading frame of this cDNA, nucleotides 1-626 of 
SEQ ID NO:7 (SEQ ID NO:9), encodes a 208 amino acid protein (Figures 8A-8B; SEQ ID
NO:8).

Northern blot analysis using the mouse clone jfinjf004cl 1 revealed strong 
expression of the mouse MANGO 003 gene in the mouse liver, skeletal muscle and kidney.
Moderate expression was detected in the heart, lung and testis, and lower levels of 
expression were detected in the mouse brain. No expression was detected in the spleen. 

Mouse MANGO 003 that has not been post-translationally modified is predicted to 
have a molecular weight of 22.3 kDa. 

Mouse MANGO 003 is a transmembrane protein having an extracellular domain 
which extends from about amino acid 1 to about amino acid 73 of SEQ ID NO:8 (SEQ ID
NO:107), a transmembrane domain which extends from about amino acid 74 to about
amino acid 96 of SEQ ID NO:8 (SEQ ID NO:108), and a cytoplasmic domain which
extends from about amino acid 97 to amino acid 208 of SEQ ID NO:8 (SEQ ID NO:109). 

An N-glycosylation site is present at amino acids 190-193 of SEQ ID NO:8. Protein
kinase C phosphorylation sites are present at amino acids 44-46, 98-100, 119-121, and 197-
199 of SEQ ID NO:8. Casein kinase II phosphorylation sites are present at amino acids 10-
13, and 119-122 of SEQ ID NO:8. A tyrosine kinase phosphorylation site is present at
amino acids 26-33 of SEQ ID NO:8. N-myristylation sites are present at amino acids 14-
19, 31-36, and 79-84 of SEQ ID NO:8.

Figure 9 depicts a hydropathy plot of mouse MANGO 003. Relatively hydrophobic
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below
the hydropathy trace. The hydropathy plot of Figure 9 indicates the presence of a
hydrophobic domain within mouse MANGO 003, suggesting that mouse MANGO 003 is a
transmembrane protein.

A global alignment between the nucleotide sequence of the open reading frame 
(ORF) of human MANGO 003 (SEQ ID NO:6) and the nucleotide sequence of the open 
reading frame of mouse MANGO 003 (SEQ ID NO:9) revealed a 31.1% identity (Figures
27A-27C). The global alignment was performed using the ALIGN program version 2.0u
(Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score of
-1212; Myers and Miller, 1989, CABIOS 4:11-7).

A local alignment between the nucleotide sequence of human MANGO 003 (SEQ 
ID NO:4) and the nucleotide sequence of mouse MANGO 003 (SEQ ID NO:7) revealed a 
62.8% identity over nucleotides 970-2080 of the human MANGO 003 sequence
(nucleotides 10-1070 of mouse MANGO 003) (Figures 28A-28B). The local alignment was
performed using the LALIGN program version 2.0u54 July 1996 (Matrix file used:
pam120.mat, gap penalties of -12/-4 with a score of 3241; Huang and Miller, 1991, Adv. Appl.
Math. 12:373-81).

A global alignment between the amino acid sequence of human MANGO 003 (SEQ 
ID NO:5) and the amino acid sequence of mouse MANGO 003 (SEQ ID NO:8) revealed a
30.1% identity (Figure 29). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of -488; Myers and Miller, 1989, CABIOS 4:11-7).
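
For readers unfamiliar with how a percent identity is derived from a global alignment, the following sketch may help. It is not the ALIGN program: it assumes a plain Needleman-Wunsch alignment with unit match and mismatch scores and a linear gap penalty, in place of the PAM120 matrix and the -12/-4 gap penalties cited above, and it assumes that identity is counted over all alignment columns. The sequences are short hypothetical strings.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[-1][-1]


def percent_identity(top, bottom):
    """Identical aligned columns as a percentage of all alignment columns."""
    identical = sum(x == y and x != "-" for x, y in zip(top, bottom))
    return 100.0 * identical / len(top)


top, bottom, total = global_align("ACGT", "AGT")
print(top, bottom, total, percent_identity(top, bottom))  # ACGT A-GT 1 75.0
```

Different programs use different denominators for percent identity (for example, the shorter sequence length rather than the alignment length), so values from this sketch are not directly comparable to the figures reported above.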

Use of MANGO 003 Nucleic Acids, Polypeptides, and Modulators Thereof

MANGO 003 includes three immunoglobulin-like domains. Proteins having such
domains play a role in mediating protein-protein and protein-ligand interactions, and thus
can influence a wide variety of biological processes, including cell surface recognition;
transduction of an extracellular signal (e.g., by interacting with a ligand and/or a cell-
surface receptor); and/or modulation of signal transduction pathways.

MANGO 003 further includes a neurotransmitter-gated ion channel domain. 
Proteins having such domains play a role in modulating signal transmission at chemical 
synapses by, for example, influencing processes, such as the release of neurotransmitters 
from a cell (e.g., a neuronal cell); modulating membrane excitability and/or resting
potential; and/or modulating ion flux across a membrane of a cell (e.g., a neuronal or a
muscle cell). Because MANGO 003 includes a neurotransmitter-gated ion channel domain,
MANGO 003 polypeptides, nucleic acids, and modulators thereof can be used to treat
neural disorders (e.g., a CNS disorder, including Alzheimer's disease, Pick's disease,
Parkinson's and other Lewy diffuse body diseases, multiple sclerosis, amyotrophic lateral
sclerosis, progressive supranuclear palsy, epilepsy, and Jakob-Creutzfieldt disease;
psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis,
mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia
or age-related memory loss; and neurological disorders, e.g., migraine).

MANGO 003 polypeptides, nucleic acids, and modulators thereof can be used to 
modulate function, survival, morphology, migration, proliferation and/or differentiation of 
cells in the tissues in which it is expressed (e.g., thyroid, liver, skeletal muscle, kidney,
heart, lung, testis and brain). For example, MANGO 003 polypeptides, nucleic acids, and
modulators thereof can be used to modulate endocrine, hepatic, skeletal muscular, renal,
cardiac, reproductive and/or brain function. Accordingly, these molecules can be used to
treat a variety of diseases including, but not limited to, endocrine disorders (e.g.,
hypothyroidism, hyperthyroidism, dwarfism, giantism, acromegaly); hepatic disorders (e.g.,
hepatitis, liver cirrhosis, hepatoma, liver cysts, and hepatic vein thrombosis); skeletal
muscular disorders; renal disorders (e.g., renal cell carcinoma, nephritis, polycystic kidney
disease); cardiovascular disorders (e.g., atherosclerosis, ischemia reperfusion injury, cardiac
hypertrophy, hypertension, coronary artery disease, myocardial infarction, arrhythmia,
cardiomyopathies, and congestive heart failure); and/or reproductive disorders (e.g.,
sterility).

MANGO 003 polypeptides, nucleic acids, or modulators thereof, can be used to treat
hepatic (liver) disorders, such as jaundice, hepatic failure, hereditary hyperbilirubinemias
(e.g., Gilbert's syndrome, Crigler-Najjar syndromes and Dubin-Johnson and Rotor's
syndromes), hepatic circulatory disorders (e.g., hepatic vein thrombosis and portal vein
obstruction and thrombosis), hepatitis (e.g., chronic active hepatitis, acute viral hepatitis, and
toxic and drug-induced hepatitis), cirrhosis (e.g., alcoholic cirrhosis, biliary cirrhosis, and
hemochromatosis), or malignant tumors (e.g., primary carcinoma, hepatoblastoma, and
angiosarcoma).

In another example, MANGO 003 polypeptides, nucleic acids, or modulators
thereof, can be used to treat disorders of skeletal muscle, such as muscular dystrophy (e.g.,
Duchenne Muscular Dystrophy, Becker Muscular Dystrophy, Emery-Dreifuss Muscular
Dystrophy, Limb-Girdle Muscular Dystrophy, Facioscapulohumeral Muscular Dystrophy,
Myotonic Dystrophy, Oculopharyngeal Muscular Dystrophy, Distal Muscular Dystrophy,
and Congenital Muscular Dystrophy), motor neuron diseases (e.g., Amyotrophic Lateral
Sclerosis, Infantile Progressive Spinal Muscular Atrophy, Intermediate Spinal Muscular
Atrophy, Spinal Bulbar Muscular Atrophy, and Adult Spinal Muscular Atrophy),
myopathies (e.g., inflammatory myopathies (e.g., Dermatomyositis and Polymyositis),
Myotonia Congenita, Paramyotonia Congenita, Central Core Disease, Nemaline Myopathy,
Myotubular Myopathy, and Periodic Paralysis), and metabolic diseases of muscle (e.g.,
Phosphorylase Deficiency, Acid Maltase Deficiency, Phosphofructokinase Deficiency,
Debrancher Enzyme Deficiency, Mitochondrial Myopathy, Carnitine Deficiency, Carnitine
Palmityl Transferase Deficiency, Phosphoglycerate Kinase Deficiency, Phosphoglycerate
Mutase Deficiency, Lactate Dehydrogenase Deficiency, and Myoadenylate Deaminase
Deficiency).

In another example, MANGO 003 polypeptides, nucleic acids, or modulators
thereof, can be used to treat renal disorders, such as glomerular diseases (e.g., acute and
chronic glomerulonephritis, rapidly progressive glomerulonephritis, nephrotic syndrome,
focal proliferative glomerulonephritis, glomerular lesions associated with systemic disease,
such as systemic lupus erythematosus, Goodpasture's syndrome, multiple myeloma,
diabetes, neoplasia, sickle cell disease, and chronic inflammatory diseases), tubular diseases
(e.g., acute tubular necrosis and acute renal failure, polycystic renal disease, medullary
sponge kidney, medullary cystic disease, nephrogenic diabetes, and renal tubular acidosis),
tubulointerstitial diseases (e.g., pyelonephritis, drug and toxin induced tubulointerstitial
nephritis, hypercalcemic nephropathy, and hypokalemic nephropathy), acute and rapidly
progressive renal failure, chronic renal failure, nephrolithiasis, vascular diseases (e.g.,
hypertension and nephrosclerosis, microangiopathic hemolytic anemia, atheroembolic renal
disease, diffuse cortical necrosis, and renal infarcts), or tumors (e.g., renal cell carcinoma
and nephroblastoma).

Further, in light of MANGO 003's pattern of expression in mice, MANGO 003
expression can be utilized as a marker for specific tissues (e.g., liver, skeletal muscle,
kidney) and/or cells (e.g., hepatic, skeletal muscle, renal) in which MANGO 003 is
expressed. MANGO 003 nucleic acids can also be utilized for chromosomal mapping.



MANGO 347 

A cDNA encoding human MANGO 347 was identified by analyzing the sequences 
of clones present in a human brain cDNA library. 

This analysis led to the identification of a clone, jlhbad295g12, encoding full-length
human MANGO 347. The cDNA of this clone is 1423 nucleotides long (Figure 10; SEQ
ID NO:10). The 414 nucleotide open reading frame of this cDNA, nucleotides 31 to 444 of
SEQ ID NO:10 (SEQ ID NO:12), encodes a 138 amino acid protein (Figure 10; SEQ ID
NO:11).

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human MANGO 347 includes a 35 amino acid signal
peptide at amino acid 1 to about amino acid 35 of SEQ ID NO:11 (SEQ ID NO:110)
preceding the mature human MANGO 347 protein, which corresponds to about amino acid
36 to amino acid 138 of SEQ ID NO:11 (SEQ ID NO:111).

Human MANGO 347 that has not been post-translationally modified is predicted to 
have a molecular weight of 15.4 kDa prior to cleavage of its signal peptide and a molecular 
weight of 11.3 kDa subsequent to cleavage of its signal peptide.

Human MANGO 347 includes a CUB domain at amino acids 40-136 of SEQ ID
NO:11 (SEQ ID NO:45). An alignment of the CUB domain of human MANGO 347 with a
consensus CUB domain amino acid sequence derived from a hidden Markov model
(SEQ ID NO:44) is shown in Figure 12.

Casein kinase II phosphorylation sites are present at amino acids 67-70 and 108-111
of SEQ ID NO:11. N-myristylation sites are present at amino acids 19-24, 31-36, 64-69,
and 113-118 of SEQ ID NO:11.

Clone jlhbad295g12, which encodes human MANGO 347, was deposited as a
composite deposit having a designation EpM347 with the American Type Culture
Collection (ATCC®, 10801 University Boulevard, Manassas, VA 20110-2209) on June 18,
1999 and assigned Accession Number PTA-250. A description of the deposit conditions
used is set forth below. This deposit will be maintained under the terms of the Budapest
Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes
of Patent Procedure. This deposit was made merely as a convenience for those of skill in
the art and is not an admission that a deposit is required under 35 U.S.C. § 112.

Figure 11 depicts a hydropathy plot of human MANGO 347. Relatively
hydrophobic regions are above the horizontal line, and relatively hydrophilic regions are
below the horizontal line. The cysteine residues (cys) are indicated by short vertical lines
just below the hydropathy trace. The hydropathy plot of Figure 11 indicates that human
MANGO 347 has a signal peptide at its amino terminus, suggesting that human MANGO
347 is a secreted protein.
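
A hydropathy trace of this kind can be sketched as a sliding-window average of per-residue hydropathy values. The text does not state which scale or window length produced the figure; the sketch below assumes the Kyte-Doolittle scale and a hypothetical 9-residue window, and the example peptide is invented for illustration rather than taken from SEQ ID NO:11.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    values = [KD[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def cysteine_positions(seq):
    """1-based positions of cysteine residues (the tick marks under the trace)."""
    return [i + 1 for i, aa in enumerate(seq.upper()) if aa == "C"]


# Hypothetical peptide: hydrophobic N-terminal stretch followed by a polar tail.
toy = "MLLLAVLLCLLAGAQADEKRSDNCEEKRTD"
profile = hydropathy_profile(toy)
cysteines = cysteine_positions(toy)
```

Windows with a positive mean plot above the horizontal line (relatively hydrophobic) and windows with a negative mean plot below it; a sustained positive stretch at the amino terminus is the pattern read here as a signal peptide.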

Use of MANGO 347 Nucleic Acids, Polypeptides, and Modulators Thereof

MANGO 347 includes a CUB domain. Proteins having such a domain play a role in
mediating cell interactions during development, and thus can influence a wide variety of
developmental processes, including morphogenesis, cellular migration, adhesion,
proliferation, differentiation, and/or survival. MANGO 347 polypeptides are expressed in
neural cells (e.g., brain cells). Because MANGO 347 includes a CUB domain and is expressed
in neural cells, MANGO 347 polypeptides, nucleic acids, and modulators thereof can be
used to treat disorders involving, e.g., cellular migration, proliferation, and differentiation of
a cell, e.g., a neural cell (e.g., a CNS disorder, including Alzheimer's disease, Pick's disease,
Parkinson's and other Lewy diffuse body diseases, multiple sclerosis, amyotrophic lateral
sclerosis, progressive supranuclear palsy, epilepsy, and Jakob-Creutzfeldt disease;
psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis,
mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia
or age-related memory loss; and neurological disorders, e.g., migraine).

Further, in light of MANGO 347's presence in a human brain cDNA library,
MANGO 347 expression can be utilized as a marker for specific tissues (e.g., brain) and/or
cells (e.g., brain) in which MANGO 347 is expressed. MANGO 347 nucleic acids can also
be utilized for chromosomal mapping.



TANGO 272

A cDNA encoding human TANGO 272 was identified by analyzing the sequences
of clones present in a human microvascular endothelial cell (HMVEC) cDNA
library.

This analysis led to the identification of a clone, jthda089h03, encoding full-length
human TANGO 272. The cDNA of this clone is 5036 nucleotides long (Figures 13A-13D;
SEQ ID NO:13). The 3149 nucleotide open reading frame of this cDNA, nucleotides 230-3379
of SEQ ID NO:13 (SEQ ID NO:15), encodes a 1050 amino acid protein (Figures 13A-13D;
SEQ ID NO:14).
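
The open reading frame coordinates quoted so far can be cross-checked with simple arithmetic: a 1-based inclusive span from nucleotide s to nucleotide e covers e - s + 1 nucleotides, and an open reading frame quoted without its stop codon should be three times the protein length. The sketch below is illustrative only; the function names are hypothetical, and the coordinates and protein lengths are those stated in the text for human MANGO 347 and human TANGO 272.

```python
def orf_span(start, end):
    """Number of nucleotides in a 1-based, inclusive span."""
    return end - start + 1


def codons(start, end):
    """Whole codons in the span; raises if the span is not a multiple of 3."""
    length = orf_span(start, end)
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# (ORF start, ORF end, protein length) as stated in the text.
stated = {
    "human MANGO 347": (31, 444, 138),
    "human TANGO 272": (230, 3379, 1050),
}

for name, (start, end, residues) in stated.items():
    print(name, orf_span(start, end), "nt,", codons(start, end), "codons")
    assert codons(start, end) == residues
```

By this count nucleotides 31 to 444 give 414 nucleotides (138 codons), and nucleotides 230 to 3379 give 3150 nucleotides (1050 codons). The latter is one more than the 3149 nucleotides quoted for human TANGO 272, which may simply reflect subtraction of the endpoints without adding one.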

Northern blot analysis using the human clone jthda089h03 revealed strong
expression of the human TANGO 272 gene in the heart. Moderate expression was detected 
in the placenta, lung, and liver, and lower levels of expression were detected in the brain, 
skeletal muscle, kidney, and pancreas. 

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human TANGO 272 includes a 20 amino acid signal
peptide at amino acid 1 to about amino acid 20 of SEQ ID NO:14 (SEQ ID NO:112)
preceding the mature human TANGO 272 protein, which corresponds to about amino acid
21 to amino acid 1050 of SEQ ID NO:14 (SEQ ID NO:113).

Human TANGO 272 that has not been post-translationally modified is predicted to
have a molecular weight of 112 kDa prior to cleavage of its signal peptide and a molecular
weight of 110 kDa subsequent to cleavage of its signal peptide.

Human TANGO 272 is a transmembrane protein having an extracellular domain
which extends from about amino acid 21 to about amino acid 767 of SEQ ID NO:14 (SEQ
ID NO:114), a transmembrane domain which extends from about amino acid 768 to about
amino acid 791 of SEQ ID NO:14 (SEQ ID NO:115), and a cytoplasmic domain which
extends from about amino acid 792 to amino acid 1050 of SEQ ID NO:14 (SEQ ID
NO:116).

Alternatively, in another embodiment, a human TANGO 272 protein contains an
extracellular domain which extends from about amino acid 792 to amino acid 1050 of SEQ
ID NO:14 (SEQ ID NO:116), a transmembrane domain which extends from about amino
acid 768 to about amino acid 791 of SEQ ID NO:14 (SEQ ID NO:115), and a cytoplasmic
domain which extends from about amino acid 21 to about amino acid 767 of SEQ ID
NO:14 (SEQ ID NO:114).
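
The domain boundaries given for human TANGO 272 can be held in a small table and checked for gaps or overlaps. The following is a minimal sketch under stated assumptions: the `Segment` type and the helper names are hypothetical, and the boundaries, which the text gives as approximate ("about"), are treated here as exact.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Human TANGO 272 (SEQ ID NO:14), first embodiment described in the text.
HUMAN_TANGO_272 = [
    Segment("signal peptide", 1, 20),
    Segment("extracellular domain", 21, 767),
    Segment("transmembrane domain", 768, 791),
    Segment("cytoplasmic domain", 792, 1050),
]


def is_contiguous(segments, total_length):
    """True if the segments tile residues 1..total_length with no gap or overlap."""
    expected_start = 1
    for seg in segments:
        if seg.start != expected_start or seg.end < seg.start:
            return False
        expected_start = seg.end + 1
    return expected_start == total_length + 1


def flipped(segments):
    """Alternative embodiment: swap the extracellular and cytoplasmic labels."""
    swap = {
        "extracellular domain": "cytoplasmic domain",
        "cytoplasmic domain": "extracellular domain",
    }
    return [seg._replace(name=swap.get(seg.name, seg.name)) for seg in segments]


def locate(segments, position):
    """Name of the segment containing a 1-based residue position."""
    for seg in segments:
        if seg.start <= position <= seg.end:
            return seg.name
    raise ValueError(f"position {position} is outside the protein")
```

For example, `locate(HUMAN_TANGO_272, 780)` falls in the transmembrane domain, while `locate(flipped(HUMAN_TANGO_272), 800)` reports the extracellular domain of the alternative embodiment.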

Human TANGO 272 includes fourteen EGF-like domains at amino acids 151-181 of
SEQ ID NO:14 (SEQ ID NO:49); amino acids 200-229 of SEQ ID NO:14 (SEQ ID
NO:50); amino acids 242-272 of SEQ ID NO:14 (SEQ ID NO:51); amino acids 285-315 of
SEQ ID NO:14 (SEQ ID NO:52); amino acids 328-358 of SEQ ID NO:14 (SEQ ID
NO:53); amino acids 378-404 of SEQ ID NO:14 (SEQ ID NO:54); amino acids 417-447 of
SEQ ID NO:14 (SEQ ID NO:55); amino acids 460-490 of SEQ ID NO:14 (SEQ ID
NO:56); amino acids 503-533 of SEQ ID NO:14 (SEQ ID NO:57); amino acids 546-576 of
SEQ ID NO:14 (SEQ ID NO:58); amino acids 589-619 of SEQ ID NO:14 (SEQ ID
NO:59); amino acids 632-661 of SEQ ID NO:14 (SEQ ID NO:60); amino acids 674-704 of
SEQ ID NO:14 (SEQ ID NO:61); and amino acids 717-747 of SEQ ID NO:14 (SEQ ID
NO:62). Figures 15A-15C depict alignments of each of the EGF-like domains of TANGO
272 with consensus hidden Markov model EGF-like domains (SEQ ID NO:46). Human
TANGO 272 further includes a delta serrate ligand domain from amino acids 518 to 576 of
SEQ ID NO:14 (SEQ ID NO:63). An alignment of the delta serrate ligand domain of
human TANGO 272 with a consensus hidden Markov model of this domain (SEQ ID
NO:47) is also depicted (Figure 15B).

An RGD cell attachment site is present at amino acids 177-179 of SEQ ID NO:14.
N-glycosylation sites are present at amino acids 284-287, 405-408, 459-462, 489-492,
504-507, 588-591, 639-642, 647-650, 716-719, and 873-876 of SEQ ID NO:14. An amidation
site is present at amino acids 628-631 of SEQ ID NO:14. Protein kinase C phosphorylation
sites are present at amino acids 38-40, 70-72, 107-109, 359-361, 461-463, 594-596,
809-811, 896-898, 940-942, 977-979, and 1022-1024 of SEQ ID NO:14. Casein kinase II
phosphorylation sites are present at amino acids 30-33, 38-41, 473-476, 548-551, 579-582,
657-660, 897-900, 921-924, 940-943, and 955-958 of SEQ ID NO:14. A tyrosine kinase
phosphorylation site is present at amino acids 361-368 of SEQ ID NO:14. N-myristylation
sites are present at amino acids 14-19, 103-108, 269-274, 302-307, 325-330, 345-350,
401-406, 427-432, 434-439, 457-462, 520-525, 586-591, 606-611, 648-653, 707-712, 714-719,
866-871, 926-931, and 1014-1019 of SEQ ID NO:14.

Clone jthda089h03, which encodes human TANGO 272, was deposited as a
composite deposit having a designation EpT272 with the American Type Culture Collection
(ATCC®, 10801 University Boulevard, Manassas, VA 20110-2236) on June 18, 1999 and
assigned Accession Number PTA-250. A description of the deposit conditions used is set
forth in the section entitled "Deposit of Clones" below. This deposit will be maintained
under the terms of the Budapest Treaty on the International Recognition of the Deposit of
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a
convenience for those of skill in the art and is not an admission that a deposit is required
under 35 U.S.C. § 112.

Figure 14 depicts a hydropathy plot of human TANGO 272. Relatively hydrophobic
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below
the hydropathy trace. The hydropathy plot of Figure 14 indicates the presence of a
hydrophobic domain within human TANGO 272, suggesting that human TANGO 272 is a
transmembrane protein.

A cDNA encoding mouse TANGO 272 was identified by analyzing the sequences of 
clones present in a mouse testis cDNA library. 

This analysis led to the identification of a clone, jtmzb062c04, encoding partial
mouse TANGO 272. The cDNA of this clone is 2569 nucleotides long (Figures 16A-16B;
SEQ ID NO:16). The 1492 nucleotide open reading frame of this cDNA, nucleotides
1-1492 of SEQ ID NO:16 (SEQ ID NO:18), encodes a 497 amino acid protein (Figures
16A-16B; SEQ ID NO:17).

Mouse TANGO 272 that has not been post-translationally modified is predicted to 
have a molecular weight of 53.5 kDa.

Mouse TANGO 272 is a transmembrane protein having an extracellular domain
which extends from about amino acid 1 to about amino acid 216 of SEQ ID NO:17 (SEQ
ID NO:118), a transmembrane domain which extends from about amino acid 217 to about
amino acid 240 of SEQ ID NO:17 (SEQ ID NO:119), and a cytoplasmic domain which
extends from about amino acid 241 to amino acid 497 of SEQ ID NO:17 (SEQ ID NO:120).

Alternatively, in another embodiment, a mouse TANGO 272 protein contains an
extracellular domain which extends from about amino acid 241 to amino acid 497 of SEQ
ID NO:17 (SEQ ID NO:120), a transmembrane domain which extends from about amino
acid 217 to about amino acid 240 of SEQ ID NO:17 (SEQ ID NO:119), and a cytoplasmic
domain which extends from about amino acid 1 to about amino acid 216 of SEQ ID NO:17
(SEQ ID NO:118).

Mouse TANGO 272 includes four EGF-like domains at about amino acids 37-67 of
SEQ ID NO:17 (SEQ ID NO:64); amino acids 80-110 of SEQ ID NO:17 (SEQ ID NO:65);
amino acids 123-153 of SEQ ID NO:17 (SEQ ID NO:66); and amino acids 166-196 of SEQ
ID NO:17 (SEQ ID NO:67). Mouse TANGO 272 further includes four laminin-EGF-like
domains at about amino acids 3-37 of SEQ ID NO:17 (SEQ ID NO:68); amino acids 41-80
of SEQ ID NO:17 (SEQ ID NO:69); amino acids 83-123 of SEQ ID NO:17 (SEQ ID
NO:70); and amino acids 127-172 of SEQ ID NO:17 (SEQ ID NO:71). Figures 39A-39B
depict alignments of each of the EGF-like and laminin-EGF-like domains of TANGO 272
with consensus hidden Markov model EGF-like domains (SEQ ID NOs:46 and 48,
respectively).

Mouse TANGO 272 further includes a delta serrate ligand domain from amino acids
10 to 67 of SEQ ID NO:17 (SEQ ID NO:72). An alignment of the delta serrate ligand
domain of mouse TANGO 272 with a consensus hidden Markov model of this domain
(SEQ ID NO:47) is also depicted in Figures 39A-39B.

Based on the Prosite analysis, EGF-like domain cysteine pattern signatures are
present at amino acids 13-24, 56-67, 99-110, 142-153, and 185-196 of SEQ ID NO:17.

N-glycosylation sites are present at amino acids 36-39, 88-91, 165-168, and 323-326
of SEQ ID NO:17. An amidation site is present at amino acids 76-79 of SEQ ID NO:17.
Protein kinase C phosphorylation sites are present at amino acids 42-44, 258-260, 354-356,
388-390, 469-471, and 492-494 of SEQ ID NO:17. Casein kinase II phosphorylation sites
are present at amino acids 106-109, 192-195, 343-346, 388-391, and 446-449 of SEQ ID
NO:17. N-myristylation sites are present at amino acids 11-16, 34-39, 47-52, 54-59,
97-102, 120-125, 140-145, 163-168, 199-204, 218-223, 372-377, and 461-466 of SEQ ID
NO:17.
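
Site predictions of this kind are typically obtained by scanning a sequence for short PROSITE patterns. The sketch below is an illustration, not the analysis actually run on SEQ ID NO:17: it expresses four commonly cited PROSITE patterns as regular expressions and reports 1-based inclusive coordinates in the same style as the text. The test peptide is invented, and the function name `scan` is hypothetical.

```python
import re

# PROSITE-style patterns written as regular expressions:
#   N-{P}-[ST]-{P}                    N-glycosylation
#   [ST]-x-[RK]                       protein kinase C phosphorylation
#   [ST]-x(2)-[DE]                    casein kinase II phosphorylation
#   G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}   N-myristylation
PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "protein kinase C phosphorylation": r"[ST].[RK]",
    "casein kinase II phosphorylation": r"[ST]..[DE]",
    "N-myristylation": r"G[^EDRKHPFYW]..[STAGCN][^P]",
}


def scan(seq):
    """Map each site type to a list of (start, end) 1-based inclusive hits."""
    hits = {}
    for name, pattern in PATTERNS.items():
        # A lookahead lets overlapping sites be reported, as in the text
        # (e.g. sites at 38-40 and 38-41 can share a start position).
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(seq)
        ]
    return hits


# Invented 22-residue peptide used only to exercise the patterns.
toy = "MGAVLSANGTAKSPRTLEDGSW"
sites = scan(toy)
```

Because these patterns are short, they match frequently by chance, so a hit indicates only a candidate site.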

Figure 17 depicts a hydropathy plot of mouse TANGO 272. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 17 indicates the presence of a 
hydrophobic domain within mouse TANGO 272, suggesting that mouse TANGO 272 is a 
transmembrane protein. 

A cDNA encoding rat TANGO 272 was identified by analyzing the sequences of 
clones present in a rat neonatal sciatic nerve cDNA library. 

This analysis led to the identification of a clone, atrxa6b6, encoding partial rat
TANGO 272. The cDNA of this clone is 3567 nucleotides long (Figures 33A-33C; SEQ ID
NO:19). The 1908 nucleotide open reading frame of this cDNA, nucleotides 925-2832 of
SEQ ID NO:19 (SEQ ID NO:21), encodes a 636 amino acid protein (Figures 33A-33C;
SEQ ID NO:20).

Rat TANGO 272 that has not been post-translationally modified is predicted to have
a molecular weight of 67.4 kDa.

Rat TANGO 272 is a transmembrane protein having an extracellular domain which
extends from about amino acid 1 to about amino acid 500 of SEQ ID NO:20 (SEQ ID
NO:122), a transmembrane domain which extends from about amino acid 501 to about
amino acid 524 of SEQ ID NO:20 (SEQ ID NO:123), and a cytoplasmic domain which
extends from about amino acid 525 to amino acid 636 of SEQ ID NO:20 (SEQ ID NO:124).

Alternatively, in another embodiment, a rat TANGO 272 protein contains an
extracellular domain which extends from about amino acid 525 to amino acid 636 of SEQ
ID NO:20 (SEQ ID NO:124), a transmembrane domain which extends from about amino
acid 501 to about amino acid 524 of SEQ ID NO:20 (SEQ ID NO:123), and a cytoplasmic
domain which extends from about amino acid 1 to about amino acid 500 of SEQ ID NO:20
(SEQ ID NO:122).

Rat TANGO 272 includes eleven EGF-like domains at about amino acids 18-48 of
SEQ ID NO:20 (SEQ ID NO:73); amino acids 61-91 of SEQ ID NO:20 (SEQ ID NO:74);
amino acids 105-137 of SEQ ID NO:20 (SEQ ID NO:75); amino acids 150-180 of SEQ ID
NO:20 (SEQ ID NO:76); amino acids 193-223 of SEQ ID NO:20 (SEQ ID NO:77); amino
acids 236-266 of SEQ ID NO:20 (SEQ ID NO:78); amino acids 279-309 of SEQ ID NO:20
(SEQ ID NO:79); amino acids 322-352 of SEQ ID NO:20 (SEQ ID NO:80); amino acids
365-394 of SEQ ID NO:20 (SEQ ID NO:81); amino acids 407-437 of SEQ ID NO:20 (SEQ
ID NO:82); and amino acids 450-480 of SEQ ID NO:20 (SEQ ID NO:83). Figures
41A-41D depict alignments of each of the EGF-like domains of rat TANGO 272 with consensus
hidden Markov model EGF-like domains (SEQ ID NO:46).

Rat TANGO 272 further includes eleven laminin/EGF-like domains at about amino
acids 22-61 of SEQ ID NO:20 (SEQ ID NO:84); amino acids 65-105 of SEQ ID NO:20
(SEQ ID NO:85); amino acids 109-150 of SEQ ID NO:20 (SEQ ID NO:86); amino acids
154-193 of SEQ ID NO:20 (SEQ ID NO:87); amino acids 197-236 of SEQ ID NO:20 (SEQ
ID NO:88); amino acids 240-279 of SEQ ID NO:20 (SEQ ID NO:89); amino acids 283-322
of SEQ ID NO:20 (SEQ ID NO:90); amino acids 326-365 of SEQ ID NO:20 (SEQ ID
NO:91); amino acids 368-407 of SEQ ID NO:20 (SEQ ID NO:92); amino acids 411-450;
and amino acids 454-489 of SEQ ID NO:20 (SEQ ID NO:93). Figures 41A-41D depict
alignments of each of the laminin/EGF-like domains of rat TANGO 272 with consensus
hidden Markov model EGF-like domains (SEQ ID NO:48).

Rat TANGO 272 further includes a delta serrate ligand domain from amino acids
246 to 309 of SEQ ID NO:20 (SEQ ID NO:95). An alignment of the delta serrate ligand
domain of rat TANGO 272 with a consensus hidden Markov model of this domain (SEQ ID
NO:47) is also depicted in Figures 41A-41D.

Based on the Prosite analysis, EGF-like domain cysteine pattern signatures are
present at amino acids 37-48, 80-91, 126-137, 169-180, 255-266, 298-309, 341-352,
383-394, 426-437, and 469-480 of SEQ ID NO:20.

N-glycosylation sites are present at amino acids 17-20, 138-141, 192-195, 222-225, 
237-240, 321-324, 372-375, 436-439, and 449-452 of SEQ ID NO:20. A cAMP/cGMP- 
dependent protein kinase phosphorylation site is present at amino acids 618-621 of SEQ ID 
NO:20. An amidation site is present at amino acids 361-364 of SEQ ID NO:20. Protein
kinase C phosphorylation sites are present at amino acids 92-94, 327-329, 542-544, and
596-598 of SEQ ID NO:20. Casein kinase II phosphorylation sites are present at amino
acids 104-107, 206-209, 281-284, and 390-393 of SEQ ID NO:20. A tyrosine kinase 
phosphorylation site is present at amino acids 94-101 of SEQ ID NO:20. N-myristylation 
sites are present at amino acids 2-7, 35-40, 58-63, 78-83, 134-139, 160-165, 167-172, 190- 
195, 210-215, 253-258, 319-324, 339-344, 381-386, 404-409, 424-429, 447-452, 483-488, 
and 502-507 of SEQ ID NO:20. 

Figure 40 depicts a hydropathy plot of rat TANGO 272. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 40 indicates the presence of a 
hydrophobic domain within rat TANGO 272, suggesting that rat TANGO 272 is a 
transmembrane protein. 

A global alignment between the nucleotide sequence of the open reading frame
(ORF) of human TANGO 272 (SEQ ID NO:15) and the nucleotide sequence of the open
reading frame of mouse TANGO 272 (SEQ ID NO:18) revealed a 39.1% identity (Figures
30A-30E). The global alignment was performed using the ALIGN program version 2.0u
(Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score of
-79; Myers and Miller, 1989, CABIOS 4:11-7).

A local alignment between the nucleotide sequence of human TANGO 272 (SEQ ID
NO:13) and the nucleotide sequence of mouse TANGO 272 (SEQ ID NO:16) revealed
67.6% identity over nucleotides 1890-4610 of the human TANGO 272 sequence (nucleotides
10-2560 of mouse TANGO 272) (Figures 31A-31D). The local alignment was performed
using the L-ALIGN program version 2.0u54 July 1996 (Matrix file used: pam120.mat, gap
penalties of -12/-4 with a score of 8462; Huang and Miller, 1991, Adv. Appl. Math.
12:373-81).

A global alignment between the amino acid sequence of human TANGO 272 (SEQ
ID NO:14) and the amino acid sequence of mouse TANGO 272 (SEQ ID NO:17) revealed a
38.2% identity (Figures 32A-32B). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of -19; Myers and Miller, 1989, CABIOS 4:11-7).

A global alignment between the nucleotide sequence of human TANGO 272 (SEQ
ID NO:13) and the nucleotide sequence of rat TANGO 272 (SEQ ID NO:19) revealed a
55.7% identity (Figures 34A-34H). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of 8635; Myers and Miller, 1989, CABIOS 4:11-7).

A global alignment between the nucleotide sequence of mouse TANGO 272 (SEQ
ID NO:16) and the nucleotide sequence of rat TANGO 272 (SEQ ID NO:19) revealed a
43.7% identity (Figures 35A-35F). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of 2827; Myers and Miller, 1989, CABIOS 4:11-7).
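
The identity percentages reported for these alignments are ratios over aligned columns. As a hedged illustration, percent identity can be computed from two gapped, equal-length strings as shown below. The exact denominator used by the ALIGN program is not stated in the text, so the convention used here (identical columns divided by all alignment columns, gaps included) is an assumption, and the two sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap residue.

    Both inputs must already be aligned, i.e. padded with gap characters
    to the same length. Gap columns count toward the denominator
    (an assumed convention; other tools divide by the shorter sequence).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Invented 11-column alignment: 8 identical columns, 1 mismatch, 2 gap columns.
row_a = "ATGGC-TTAGC"
row_b = "ATGACGTT-GC"
identity = percent_identity(row_a, row_b)
```

Because a global alignment spans both sequences end to end, aligning a partial clone against a full-length sequence adds many gap columns and lowers the global figure, which is consistent with the local alignment above reporting a higher identity over the overlapping region than the corresponding global alignments.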



Use of TANGO 272 Nucleic Acids, Polypeptides, and Modulators Thereof

TANGO 272 includes fourteen EGF-like domains. Proteins having such domains
play a role in mediating protein-protein interactions, and thus can influence a wide variety 
of biological processes, including cell surface recognition; modulation of cell-cell contact; 
modulation of cell fate determination; and modulation of wound healing and tissue repair. 

TANGO 272 further includes an RGD cell attachment site. Proteins having such
a site are typically extracellular matrix proteins such as collagens, laminin and
fibronectin, among others (reviewed in Ruoslahti, 1996, Annu. Rev. Cell Dev. Biol.
12:697-715). An RGD cell attachment site typically interacts with (e.g., binds to) a cell
surface receptor, such as an integrin receptor, and thus mediates a variety of biological
processes, including cellular adhesion and migration, among others.

Because TANGO 272 includes EGF-like domains and an RGD cell attachment site,
TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to treat
disorders involving, e.g., cellular migration, proliferation, and differentiation of a cell. For
example, TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to
treat neoplastic disorders, e.g., cancer and tumor metastasis.

TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to
modulate function, survival, morphology, migration, proliferation, tissue repair and/or
differentiation of cells in the tissues in which it is expressed (e.g., microvascular endothelial
cells). For example, TANGO 272 polypeptides, nucleic acids, and modulators thereof can
be used to modulate cardiovascular function, and/or to promote wound healing and tissue
repair (e.g., of the skin, cornea and mucosal lining). Accordingly, these molecules can be
used to treat a variety of cardiovascular diseases including, but not limited to,
atherosclerosis, ischemia reperfusion injury, cardiac hypertrophy, hypertension, coronary
artery disease, myocardial infarction, arrhythmia, cardiomyopathies, and congestive heart
failure.

As TANGO 272 exhibits expression in the heart, TANGO 272 nucleic acids,
proteins, and modulators thereof can be used to treat heart disorders, e.g., ischemic heart
disease, atherosclerosis, hypertension, angina pectoris, hypertrophic cardiomyopathy, and
congenital heart disease.

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat placental disorders, such as toxemia of pregnancy (e.g., preeclampsia
and eclampsia), placentitis, or spontaneous abortion. 

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat pulmonary (lung) disorders, such as atelectasis, cystic fibrosis,
rheumatoid lung disease, pulmonary congestion or edema, chronic obstructive airway
disease (e.g., emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis), diffuse
interstitial diseases (e.g., sarcoidosis, pneumoconiosis, hypersensitivity pneumonitis,
Goodpasture's syndrome, idiopathic pulmonary hemosiderosis, pulmonary alveolar
proteinosis, desquamative interstitial pneumonitis, chronic interstitial pneumonia, fibrosing
alveolitis, Hamman-Rich syndrome, pulmonary eosinophilia, diffuse interstitial fibrosis,
Wegener's granulomatosis, lymphomatoid granulomatosis, and lipid pneumonia), or tumors
(e.g., bronchogenic carcinoma, bronchioloalveolar carcinoma, bronchial carcinoid,
hamartoma, and mesenchymal tumors).

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat hepatic (liver) disorders, such as jaundice, hepatic failure, hereditary
hyperbilirubinemias (e.g., Gilbert's syndrome, Crigler-Najjar syndromes and Dubin-Johnson
and Rotor's syndromes), hepatic circulatory disorders (e.g., hepatic vein thrombosis and
portal vein obstruction and thrombosis), hepatitis (e.g., chronic active hepatitis, acute viral
hepatitis, and toxic and drug-induced hepatitis), cirrhosis (e.g., alcoholic cirrhosis, biliary
cirrhosis, and hemochromatosis), or malignant tumors (e.g., primary carcinoma,
hepatoblastoma, and angiosarcoma).

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat disorders of the brain, such as cerebral edema, hydrocephalus, brain
herniations, iatrogenic disease (due to, e.g., infection, toxins, or drugs), inflammations (e.g.,
bacterial and viral meningitis, encephalitis, and cerebral toxoplasmosis), cerebrovascular
diseases (e.g., hypoxia, ischemia, and infarction, intracranial hemorrhage and vascular
malformations, and hypertensive encephalopathy), and tumors (e.g., neuroglial tumors,
neuronal tumors, tumors of pineal cells, meningeal tumors, primary and secondary
lymphomas, intracranial tumors, and medulloblastoma), and to treat injury or trauma to the
brain.

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat disorders of skeletal muscle, such as muscular dystrophy (e.g.,
Duchenne Muscular Dystrophy, Becker Muscular Dystrophy, Emery-Dreifuss Muscular
Dystrophy, Limb-Girdle Muscular Dystrophy, Facioscapulohumeral Muscular Dystrophy,
Myotonic Dystrophy, Oculopharyngeal Muscular Dystrophy, Distal Muscular Dystrophy,
and Congenital Muscular Dystrophy), motor neuron diseases (e.g., Amyotrophic Lateral
Sclerosis, Infantile Progressive Spinal Muscular Atrophy, Intermediate Spinal Muscular
Atrophy, Spinal Bulbar Muscular Atrophy, and Adult Spinal Muscular Atrophy),
myopathies (e.g., inflammatory myopathies (e.g., Dermatomyositis and Polymyositis),
Myotonia Congenita, Paramyotonia Congenita, Central Core Disease, Nemaline Myopathy,
Myotubular Myopathy, and Periodic Paralysis), and metabolic diseases of muscle (e.g.,
Phosphorylase Deficiency, Acid Maltase Deficiency, Phosphofructokinase Deficiency,
Debrancher Enzyme Deficiency, Mitochondrial Myopathy, Carnitine Deficiency, Carnitine
Palmityl Transferase Deficiency, Phosphoglycerate Kinase Deficiency, Phosphoglycerate
Mutase Deficiency, Lactate Dehydrogenase Deficiency, and Myoadenylate Deaminase
Deficiency).

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat renal disorders, such as glomerular diseases (e.g., acute and chronic 
glomerulonephritis, rapidly progressive glomerulonephritis, nephrotic syndrome, focal 
proliferative glomerulonephritis, glomerular lesions associated with systemic disease, such 
as systemic lupus erythematosus, Goodpasture's syndrome, multiple myeloma, diabetes, 
neoplasia, sickle cell disease, and chronic inflammatory diseases), tubular diseases (e.g., 
acute tubular necrosis and acute renal failure, polycystic renal disease, medullary sponge 
kidney, medullary cystic disease, nephrogenic diabetes, and renal tubular acidosis), 
tubulointerstitial diseases (e.g., pyelonephritis, drug and toxin induced tubulointerstitial 
nephritis, hypercalcemic nephropathy, and hypokalemic nephropathy), acute and rapidly 
progressive renal failure, chronic renal failure, nephrolithiasis, vascular diseases (e.g., 
hypertension and nephrosclerosis, microangiopathic hemolytic anemia, atheroembolic renal 
disease, diffuse cortical necrosis, and renal infarcts), or tumors (e.g., renal cell carcinoma 
and nephroblastoma). 

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat pancreatic disorders, such as pancreatitis (e.g., acute hemorrhagic 
pancreatitis and chronic pancreatitis), pancreatic cysts (e.g., congenital cysts, pseudocysts, 
and benign or malignant neoplastic cysts), pancreatic tumors (e.g., pancreatic carcinoma and 
adenoma), diabetes mellitus (e.g., insulin- and non-insulin-dependent types, impaired 
glucose tolerance, and gestational diabetes), or islet cell tumors (e.g., insulinomas, 
adenomas, Zollinger-Ellison syndrome, glucagonomas, and somatostatinoma). 

Further, in light of TANGO 272's pattern of expression in humans, TANGO 272 
expression can be utilized as a marker for specific tissues (e.g., cardiovascular) and/or cells 
(e.g., cardiac) in which TANGO 272 is expressed. TANGO 272 nucleic acids can also be 
utilized for chromosomal mapping. 

TANGO 295 

A cDNA encoding human TANGO 295 was identified by analyzing the sequences 
of clones present in a human mammary epithelium cDNA library. 

This analysis led to the identification of a clone, jthvb023d09, encoding full-length 
human TANGO 295. The cDNA of this clone is 1497 nucleotides long (Figure 18; SEQ ID 
NO:22). The 468 nucleotide open reading frame of this cDNA, nucleotides 217-684 of 
SEQ ID NO:22 (SEQ ID NO:24), encodes a 156 amino acid protein (Figure 18; SEQ ID 
NO:23). 

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein 
Engineering 10:1-6) predicted that human TANGO 295 includes a 28 amino acid signal 
peptide at amino acid 1 to about amino acid 28 of SEQ ID NO:23 (SEQ ID NO:125) 
preceding the mature human TANGO 295 protein which corresponds to about amino acid 
29 to amino acid 156 of SEQ ID NO:23 (SEQ ID NO:126). 

Human TANGO 295 that has not been post-translationally modified is predicted to 
have a molecular weight of 17.5 kDa prior to cleavage of its signal peptide and a molecular 
weight of 14.6 kDa subsequent to cleavage of its signal peptide. 
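
Predicted weights of this kind follow from summing average residue masses over the unmodified polypeptide, once for the full precursor and once for the chain that remains after the signal peptide is removed. The Python sketch below illustrates only that arithmetic. The mass table holds standard average residue masses (rounded), and `toy_precursor` and `signal_len` are hypothetical stand-ins, not SEQ ID NO:23 or its predicted cleavage site.

```python
# Average residue masses in daltons (free amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Return the mass of an unmodified polypeptide in kilodaltons."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return total / 1000.0


def weights_before_and_after_cleavage(precursor: str, signal_len: int):
    """Return (precursor mass, mature mass) for a given signal peptide length."""
    mature = precursor[signal_len:]
    return molecular_weight_kda(precursor), molecular_weight_kda(mature)


# Hypothetical 12-residue precursor with a hypothetical 4-residue signal peptide.
toy_precursor = "MALLGSTKDEFW"
signal_len = 4
before, after = weights_before_and_after_cleavage(toy_precursor, signal_len)
print(f"precursor {before:.3f} kDa, mature {after:.3f} kDa")
```

The difference between the two values is simply the summed residue mass of the removed signal peptide, which is why the drop reported for a protein depends on the composition of its signal peptide and not only on its length.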

Secretion assays reveal that human TANGO 295 protein is secreted as a 17 kDa 
protein. The secretion assays were performed as follows: 8x10^ 293T cells were plated per 
well in a 6-well plate and the cells were incubated in growth medium (DMEM, 10% fetal 
bovine serum, penicillin/streptomycin) at 37 °C, 5% CO2 overnight. 293T cells were 
transfected with 2 µg of full-length MANGO 245 inserted in the pMET7 vector/well and 10 
µg LipofectAMINE (GIBCO/BRL Cat. # 18324-012)/well according to the protocol for 
GIBCO/BRL LipofectAMINE. The transfectant was removed 5 hours later and fresh 
growth medium was added to allow the cells to recover overnight. The medium was 
removed and each well was gently washed twice with DMEM without methionine and 
cysteine (ICN Cat. # 16-424-54). 1 ml DMEM without methionine and cysteine with 50 
µCi Trans-35S (ICN Cat. # 51006) was added to each well and the cells were incubated at 
37 °C, 5% CO2 for the appropriate time period. A 150 µl aliquot of conditioned medium 
was obtained and 150 µl of 2X SDS sample buffer was added to the aliquot. The sample 
was heat-inactivated and loaded on a 4-20% SDS-PAGE gel. The gel was fixed and the 
presence of secreted protein was detected by autoradiography. 

Human TANGO 295 includes a pancreatic ribonuclease domain at amino acids 
32-156 of SEQ ID NO:23 (SEQ ID NO:97). Figure 20 depicts an alignment of the pancreatic 
ribonuclease domain of human TANGO 295 with a consensus hidden Markov model 
pancreatic ribonuclease domain (SEQ ID NO:96). 

An N-glycosylation site is present at amino acids 127-130 of SEQ ID NO:23. A 
cAMP/cGMP dependent protein kinase site is present at amino acids 139-142 of SEQ ID 
NO:23. Protein kinase C phosphorylation sites are present at amino acids 27-29, 62-64, 
85-87, and 113-115 of SEQ ID NO:23. N-myristylation sites are present at amino acids 18-23 
and 32-37 of SEQ ID NO:23. 
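
Site lists like the one above are commonly produced by scanning a protein sequence against short consensus patterns. The text does not name the tool that was used, so the sketch below is an assumption: it rewrites PROSITE-style consensus patterns as Python regular expressions and applies them to a hypothetical toy sequence, not to SEQ ID NO:23. The names `SITE_PATTERNS` and `find_sites` are illustrative.

```python
import re

# PROSITE-style consensus patterns written as regular expressions.
# Assumed for illustration; the source does not state which patterns were used.
SITE_PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "cAMP/cGMP-dependent kinase": r"[RK]{2}.[ST]",
    "protein kinase C": r"[ST].[RK]",
    "casein kinase II": r"[ST].{2}[DE]",
    "N-myristylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def find_sites(sequence: str) -> dict:
    """Return 1-based inclusive (start, end) spans for every pattern hit."""
    hits = {}
    for name, pattern in SITE_PATTERNS.items():
        # Wrapping the pattern in a lookahead lets overlapping hits be reported.
        scanner = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in scanner.finditer(sequence)
        ]
    return hits


toy_sequence = "MKRNGTAKKASAPEGLLASA"  # hypothetical, not SEQ ID NO:23
for site, spans in find_sites(toy_sequence).items():
    print(site, spans)
```

Patterns this short match by chance in most proteins, so hits of this kind indicate candidate sites rather than demonstrated modifications.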

Global alignment of the human TANGO 295 and GenPept AF037081 amino acid 
sequences revealed 53.2% identity (Matrix file used: pam120.mat, gap penalties of -12/-4; 
Myers and Miller, 1989, CABIOS 4:11-7) (Figure 36). A global alignment of the human 
TANGO 295 and GenPept AF037081 nucleotide sequences revealed a 22.6% identity 
between these two sequences (Figures 37A-37C) (Matrix file used: pam120.mat, gap 
penalties of -12/-4 with a global alignment score of -2718; Myers and Miller, 1989, 
CABIOS 4:11-7). 

Local alignment of the human TANGO 295 and Genbank AF037081 nucleotide 
sequences revealed 62.7% identity between nucleotides 235-687 of human TANGO 295, 
and nucleotides 3-453 of AF037081; 43.4% identity between nucleotides 410-850 of human 
TANGO 295, and nucleotides 3-450 of AF037081; and 46.5% identity between nucleotides 
432-700 of human TANGO 295, and nucleotides 5-251 of AF037081 (Matrix file used: 
pam120.mat, gap penalties of -12/-4 with a global alignment score of 1214; Huang and 
Miller, 1991, Adv. Appl. Math. 12:373-81) (Figures 38A-38B). 
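
Identity percentages such as those above are read off a finished alignment. The Python sketch below shows one common convention, identical columns divided by the number of aligned columns. The text does not state which denominator the alignment programs used, so both the convention and the toy strings are assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap  # a gap facing a gap is never counted as a match
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical gapped alignment of two short peptides.
toy_a = "MKT-LLVA"
toy_b = "MRTALL-A"
print(percent_identity(toy_a, toy_b))
```

Dividing by the length of the shorter sequence, or by the number of ungapped columns, gives a different figure for the same alignment, which is one reason identity values are reported together with the matrix and gap penalties used.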

Clone jthvb023d09, which encodes human TANGO 295, was deposited as a 
composite deposit having a designation EpT295 with the American Type Culture Collection 
(ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 and 
assigned Accession Number PTA-249. Deposit conditions are described below in the 
section entitled "Deposit of Clones". This deposit will be maintained under the terms of the 
Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the 
Purposes of Patent Procedure. This deposit was made merely as a convenience for those of 
skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112. 

Figure 19 depicts a hydropathy plot of human TANGO 295. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 19 indicates that human TANGO 295 
has a signal peptide at its amino terminus, suggesting that human TANGO 295 is a secreted 
protein. 

Use of TANGO 295 Nucleic Acids, Polypeptides, and Modulators Thereof 

TANGO 295 includes a pancreatic ribonuclease domain. Proteins having such 
domains have pyrimidine-specific endonuclease activity, and are present at elevated levels 
in the pancreas of various mammals and a few reptiles. TANGO 295 shows some structural 
similarities to Ribonuclease k6 (RNase k6). RNase k6 is expressed in human monocytes 
and neutrophils (but not in eosinophils), suggesting a role for this ribonuclease in regulating 
host defense. Based on the structural similarities between TANGO 295 and RNase k6, 
TANGO 295 may play a role in regulating host defense. 

TANGO 295 polypeptides, nucleic acids, and modulators thereof, can be used to 
modulate the function, morphology, proliferation and/or differentiation of cells in the 
tissues in which it is expressed (e.g., mammary epithelium). Accordingly, TANGO 295 
polypeptides, nucleic acids, and modulators thereof can be used to treat epithelial disorders, 
e.g., mammary epithelial disorders (e.g., breast cancer). 

Further, in light of TANGO 295's presence in a human mammary epithelium cDNA 
library, TANGO 295 expression can be utilized as a marker for specific tissues (e.g., breast) 
and/or cells (e.g., mammary) in which TANGO 295 is expressed. TANGO 295 nucleic 
acids can also be utilized for chromosomal mapping. 

TANGO 354 

A cDNA encoding human TANGO 354 was identified by analyzing the sequences 
of clones present in a Mixed Lymphocyte Reaction (MLR) cDNA library. 

This analysis led to the identification of a clone, jthLa042a04, encoding full-length 
human TANGO 354. The cDNA of this clone is 1788 nucleotides long (Figures 21A-21B; 
SEQ ID NO:25). The 915 nucleotide open reading frame of this cDNA, nucleotides 62-976 
of SEQ ID NO:25 (SEQ ID NO:27), encodes a 305 amino acid protein (Figures 21A-21B; 
SEQ ID NO:26). 

Human TANGO 354 that has not been post-translationally modified is predicted to 
have a molecular weight of 33.8 kDa prior to cleavage of its signal peptide (31.6 kDa after 
cleavage of its signal peptide). 

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein 
Engineering 10:1-6) predicted that human TANGO 354 includes a 19 amino acid signal 
peptide at amino acid 1 to about amino acid 19 of SEQ ID NO:26 (SEQ ID NO:127) 
preceding the mature human TANGO 354 protein which corresponds to about amino acid 
20 to amino acid 305 of SEQ ID NO:26 (SEQ ID NO:128). 

Human TANGO 354 is a transmembrane protein having an extracellular domain 
which extends from about amino acid 20 to about amino acid 169 of SEQ ID NO:26 (SEQ 
ID NO:129), a transmembrane domain which extends from about amino acid 170 to about 
amino acid 193 of SEQ ID NO:26 (SEQ ID NO:130), and a cytoplasmic domain which 
extends from about amino acid 194 to amino acid 305 of SEQ ID NO:26 (SEQ ID NO:131). 
Alternatively, in another embodiment, a human TANGO 354 protein contains an 
extracellular domain which extends from about amino acid 194 to amino acid 305 of SEQ 
ID NO:26 (SEQ ID NO:131), a transmembrane domain which extends from about amino 
acid 170 to about amino acid 193 of SEQ ID NO:26 (SEQ ID NO:130), and a cytoplasmic 
domain which extends from about amino acid 20 to about amino acid 169 of SEQ ID 
NO:26 (SEQ ID NO:129). 

Human TANGO 354 includes an immunoglobulin domain at amino acids 33-110 of 
SEQ ID NO:26 (SEQ ID NO:41). Figure 23 depicts alignments of the immunoglobulin 
domains of TANGO 354 with consensus hidden Markov model immunoglobulin domains 
(SEQ ID NO:37). 

An N-glycosylation site is present at amino acids 88-91 of SEQ ID NO:26. A 
cAMP and cGMP-dependent protein kinase phosphorylation site is present at amino acids 
233-236 of SEQ ID NO:26. Protein kinase C phosphorylation sites are present at amino 
acids 81-83, 231-233, and 236-238 of SEQ ID NO:26. Casein kinase II phosphorylation 
sites are present at amino acids 44-47, 69-72, 81-84, 94-97, 101-104, 113-116, and 146-149 
of SEQ ID NO:26. A tyrosine kinase phosphorylation site is present at amino acids 
291-299 of SEQ ID NO:26. N-myristylation sites are present at amino acids 30-35 and 109-114 
of SEQ ID NO:26. 

Clone jthLa042a04, which encodes human TANGO 354, was deposited as EpT354 
with the American Type Culture Collection (ATCC® 10801 University Boulevard, 
Manassas, VA 20110-2209) on June 18, 1999 and assigned Accession Number PTA-249. 
This deposit will be maintained under the terms of the Budapest Treaty on the International 
Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This 
deposit was made merely as a convenience for those of skill in the art and is not an 
admission that a deposit is required under 35 U.S.C. §112. 

Figure 22 depicts a hydropathy plot of human TANGO 354. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 22 indicates the presence of a 
hydrophobic domain within human TANGO 354, suggesting that human TANGO 354 is a 
transmembrane protein. 

Use of TANGO 354 Nucleic Acids, Polypeptides, and Modulators Thereof 

TANGO 354 includes an immunoglobulin-like domain. Proteins having such 
domains play a role in mediating protein-protein and protein-ligand interactions, and thus 
can influence a wide variety of biological processes, including modulation of cell surface 
recognition; modulation of cellular motility, e.g., chemotaxis and chemokinesis; 
transduction of an extracellular signal (e.g., by interacting with a ligand and/or a 
cell-surface receptor); and/or modulation of a signal transduction pathway. 

TANGO 354 polypeptides, nucleic acids, and modulators thereof can be used to 
modulate function, survival, morphology, migration, proliferation and/or differentiation of 
cells in the tissues in which it is expressed (e.g., hematopoietic tissues). 

Because of the presence of an immunoglobulin domain and the expression of 
TANGO 354 in hematopoietic cells, TANGO 354 polypeptides, nucleic acids, and 
modulators thereof can be used to modulate (e.g., increase or decrease) hematopoietic 
function, thereby influencing one or more of: (1) regulation of hematopoiesis; (2) 
modulation of haemostasis; (3) modulation of an inflammatory response; (4) modulation of 
neoplastic growth, e.g., inhibition of tumor growth; and/or (5) regulation of thrombolysis. 

Accordingly, TANGO 354 polypeptides, nucleic acids, and modulators thereof can 
be used to treat a variety of hematopoietic diseases including, but not limited to, myeloid 
disorders and/or lymphoid malignancies. Exemplary myeloid diseases that can be treated 
include acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and 
chronic myelogenous leukemia (CML) (reviewed in Vaickus, 1991, Crit. Rev. in 
Oncol./Hemotol. 11:267-97). Exemplary lymphoid malignancies that can be treated using 
these molecules include acute lymphoblastic leukemia (ALL) which includes B-lineage 
ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia 
(PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). 
Additional forms of malignant lymphomas include non-Hodgkin lymphoma and variants 
thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T- 
cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF) and Hodgkin's disease. 

In one embodiment, TANGO 354 polypeptides, nucleic acids, and modulators 
thereof can be used to treat a variety of neoplastic diseases, including malignancies of the 
various organ systems, such as those affecting lung, breast, lymphoid, gastrointestinal, and 
genito-urinary tract, as well as adenocarcinomas which include malignancies such as most 
colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell 
carcinoma of the lung, cancer of the small intestine and cancer of the esophagus. 

The term "carcinoma" is art recognized and refers to malignancies of epithelial or 
endocrine tissues including respiratory system carcinomas, gastrointestinal system 
carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, 
prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas 
include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon 
and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors 
composed of carcinomatous and sarcomatous tissues. An "adenocarcinoma" refers to a 
carcinoma derived from glandular tissue or in which the tumor cells form recognizable 
glandular structures. The term "sarcoma" is art recognized and refers to malignant tumors 
of mesenchymal derivation. 

TANGO 354 polypeptides, nucleic acids, and modulators thereof can also be used to 
treat a variety of non-cancerous diseases or conditions involving, for example, aberrant T 
cell activity (e.g., aberrant T cell proliferation and/or secretion). Examples of such T cell 
diseases or conditions include inflammation; allergy, for example, atopic allergy; organ 
rejection after transplantation (e.g., skin graft, cardiac graft, islet graft); graft-versus-host 
disease; autoimmune diseases (including, for example, diabetes mellitus, arthritis (including 
rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), 
multiple sclerosis, encephalomyelitis, diabetes, myasthenia gravis, systemic lupus 
erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and 
eczematous dermatitis), psoriasis, Sjögren's Syndrome, including keratoconjunctivitis sicca 
secondary to Sjögren's Syndrome, alopecia areata, allergic responses due to arthropod bite 
reactions, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, 
ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, 
vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, 
autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic 
encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, 
pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's 
granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, 
lichen planus, Crohn's disease, Graves ophthalmopathy, sarcoidosis, primary biliary 
cirrhosis, uveitis posterior, and interstitial lung fibrosis). 

Further, in light of TANGO 354's presence in a Mixed Lymphocyte Reaction cDNA 
library, TANGO 354 expression can be utilized as a marker for specific tissues (e.g., 
lymphoid tissues such as the thymus and spleen) and/or cells (e.g., lymphocytes) in which 
TANGO 354 is expressed. TANGO 354 nucleic acids can also be utilized for chromosomal 
mapping. 

TANGO 378 

A cDNA encoding human TANGO 378 was identified by analyzing the sequences 
of clones present in a human natural killer cell cDNA library. 

This analysis led to the identification of a clone, jthta028f04, encoding full-length 
human TANGO 378. The cDNA of this clone is 3258 nucleotides long (Figures 24A-24C; 
SEQ ID NO:28). The 1584 nucleotide open reading frame of this cDNA, nucleotides 42 to 
1625 of SEQ ID NO:28 (SEQ ID NO:30), encodes a 528 amino acid protein (Figure 25; 
SEQ ID NO:29). 
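
The open reading frame coordinates reported for TANGO 295, TANGO 354 and TANGO 378 can be checked with simple arithmetic: an inclusive nucleotide range of length n encodes n/3 residues when the stop codon is not counted in the range. The Python sketch below reruns that check on the coordinates stated in the text. The helper name `orf_residue_count` is hypothetical.

```python
def orf_residue_count(first_nt: int, last_nt: int) -> int:
    """Number of codons in an inclusive, 1-based nucleotide range."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("range is empty or not a whole number of codons")
    return length // 3


# (first nucleotide, last nucleotide, reported protein length) as given in the text.
REPORTED_ORFS = {
    "TANGO 295": (217, 684, 156),
    "TANGO 354": (62, 976, 305),
    "TANGO 378": (42, 1625, 528),
}

for name, (first, last, residues) in REPORTED_ORFS.items():
    nucleotides = last - first + 1
    consistent = orf_residue_count(first, last) == residues
    print(f"{name}: {nucleotides} nt, {residues} aa, consistent={consistent}")
```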

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein 
Engineering 10:1-6) predicted that human TANGO 378 includes a 21 amino acid signal 
peptide at amino acid 1 to about amino acid 21 of SEQ ID NO:29 (SEQ ID NO:132) 
preceding the mature human TANGO 378 protein which corresponds to about amino acid 
22 to amino acid 528 of SEQ ID NO:29 (SEQ ID NO:133). 

Human TANGO 378 that has not been post-translationally modified is predicted to 
have a molecular weight of 59.0 kDa prior to cleavage of its signal peptide and a molecular 
weight of 56.7 kDa subsequent to cleavage of its signal peptide. 

Human TANGO 378 is a seven transmembrane G-protein coupled receptor (GPCR) 
protein having an N-terminal extracellular domain which extends from about amino acid 22 
to about amino acid 244 of SEQ ID NO:29 (SEQ ID NO:134); seven transmembrane 
domains which extend from about amino acids 245 to about amino acid 269 of SEQ ID 
NO:29 (SEQ ID NO:135), about amino acids 287 to about amino acid 306 of SEQ ID 
NO:29 (SEQ ID NO:136), about amino acids 323 to about amino acid 343 of SEQ ID 
NO:29 (SEQ ID NO:137), about amino acids 358 to about amino acid 376 of SEQ ID 
NO:29 (SEQ ID NO:138), about amino acids 414 to about amino acid 438 of SEQ ID 
NO:29 (SEQ ID NO:139), about amino acids 457 to about amino acid 477 of SEQ ID 
NO:29 (SEQ ID NO:140), and about amino acids 485 to about amino acid 504 of SEQ ID 
NO:29 (SEQ ID NO:141); and a C-terminal cytoplasmic domain which extends from about 
amino acid 505 to amino acid 528 of SEQ ID NO:29 (SEQ ID NO:142). Figure 26 depicts 
an alignment of each of the transmembrane domains of TANGO 378 with the consensus 
hidden Markov model seven transmembrane receptor sequences (SEQ ID NO:98). 

Alternatively, in another embodiment, a human TANGO 378 protein contains an 
N-terminal extracellular domain which extends from about amino acid 505 to amino acid 528 
of SEQ ID NO:29 (SEQ ID NO:142); seven transmembrane domains which extend from 
about amino acids 245 to about amino acid 269 of SEQ ID NO:29 (SEQ ID NO:135), about 
amino acids 287 to about amino acid 306 of SEQ ID NO:29 (SEQ ID NO:136), about 
amino acids 323 to about amino acid 343 of SEQ ID NO:29 (SEQ ID NO:137), about 
amino acids 358 to about amino acid 376 of SEQ ID NO:29 (SEQ ID NO:138), about 
amino acids 414 to about amino acid 438 of SEQ ID NO:29 (SEQ ID NO:139), about 
amino acids 457 to about amino acid 477 of SEQ ID NO:29 (SEQ ID NO:140), and about 
amino acids 485 to about amino acid 504 of SEQ ID NO:29 (SEQ ID NO:141); and a 
C-terminal cytoplasmic domain which extends from about amino acid 22 to about amino acid 
244 of SEQ ID NO:29 (SEQ ID NO:134). 

Human TANGO 378 includes three extracellular loops which extend from about 
amino acid 307 to about amino acid 322 of SEQ ID NO:29 (SEQ ID NO:143), about amino 
acid 377 to about amino acid 413 of SEQ ID NO:29 (SEQ ID NO:144), and about amino 
acid 478 to about amino acid 484 of SEQ ID NO:29 (SEQ ID NO:145). 

Human TANGO 378 includes three intracellular loops which extend from about 
amino acid 270 to about amino acid 286 of SEQ ID NO:29 (SEQ ID NO:146), about amino 
acid 344 to about amino acid 357 of SEQ ID NO:29 (SEQ ID NO:147), and about amino 
acid 439 to about amino acid 456 of SEQ ID NO:29 (SEQ ID NO:148). 
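
Taken together, the N-terminal domain, the seven transmembrane domains, the six loops and the C-terminal domain described for human TANGO 378 should tile the mature protein (about residues 22 to 528) with no gaps or overlaps. The Python sketch below checks that, using the approximate boundaries given in the text. The segment labels and the function name `tiles_exactly` are illustrative only.

```python
# (label, first residue, last residue) using the boundaries stated in the text.
SEGMENTS = [
    ("N-terminal domain", 22, 244),
    ("TM1", 245, 269),
    ("intracellular loop 1", 270, 286),
    ("TM2", 287, 306),
    ("extracellular loop 1", 307, 322),
    ("TM3", 323, 343),
    ("intracellular loop 2", 344, 357),
    ("TM4", 358, 376),
    ("extracellular loop 2", 377, 413),
    ("TM5", 414, 438),
    ("intracellular loop 3", 439, 456),
    ("TM6", 457, 477),
    ("extracellular loop 3", 478, 484),
    ("TM7", 485, 504),
    ("C-terminal domain", 505, 528),
]


def tiles_exactly(segments, first: int, last: int) -> bool:
    """True if the segments cover first..last once each, with no gap or overlap."""
    expected_start = first
    for _, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == last + 1


print(tiles_exactly(SEGMENTS, 22, 528))  # prints True
```

The same check applies to the alternative embodiment, because swapping which side of the membrane a segment faces changes its label but not its boundaries.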

N-glycosylation sites are present at amino acids 18-21, 58-61, 65-68, 146-149, 
173-176, 179-182, 394-397, and 400-403 of SEQ ID NO:29. A cAMP and cGMP-dependent 
protein kinase phosphorylation site is present at amino acids 274-277 of SEQ ID NO:29. 
Protein kinase C phosphorylation sites are present at amino acids 45-47, 93-95, 375-377, 
437-439, 449-451, and 505-507 of SEQ ID NO:29. Casein kinase II phosphorylation sites 
are present at amino acids 23-26, 29-32, and 510-513 of SEQ ID NO:29. N-myristylation 
sites are present at amino acids 86-91, 101-106, 157-162, 255-260, 311-316, 420-425, and 
467-472 of SEQ ID NO:29. A thiol (cysteine) protease histidine site is present at amino 
acids 410-420 of SEQ ID NO:29. 

Clone jthta028f04, which encodes human TANGO 378, was deposited as EpT378 
with the American Type Culture Collection (ATCC® 10801 University Boulevard, 
Manassas, VA 20110-2209) on June 18, 1999 and assigned Accession Number PTA-249. 
This deposit will be maintained under the terms of the Budapest Treaty on the International 
Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This 
deposit was made merely as a convenience for those of skill in the art and is not an 
admission that a deposit is required under 35 U.S.C. §112. 

Figure 25 depicts a hydropathy plot of human TANGO 378. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 25 indicates that human TANGO 378 
has a signal peptide at its amino terminus and seven hydrophobic domains within human 
TANGO 378, suggesting that human TANGO 378 is a transmembrane protein. 
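
A hydropathy trace like those described for Figures 19, 22 and 25 is a sliding-window average of per-residue hydrophobicity values, with hydrophobic stretches rising above a reference line. The text does not say which scale or window length was used. The Python sketch below assumes the Kyte-Doolittle scale, a hypothetical window of 9 residues and a hypothetical threshold of 1.6, and it runs on a made-up sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the source).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Mean hydropathy of every full window; entry i covers residues i+1..i+window."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(profile: list, threshold: float = 1.6) -> list:
    """1-based start positions of windows whose mean exceeds the threshold."""
    return [i + 1 for i, value in enumerate(profile) if value > threshold]


# Hypothetical sequence: a nine-leucine stretch flanked by charged residues.
toy_sequence = "DDDDD" + "LLLLLLLLL" + "DDDDD"
profile = hydropathy_profile(toy_sequence)
print(hydrophobic_windows(profile))
```

A short window emphasizes local peaks such as a signal peptide, while a window close to the length of a membrane-spanning helix (roughly 19 to 21 residues) is the more usual choice when looking for transmembrane domains.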

Use of TANGO 378 Nucleic Acids, Polypeptides, and Modulators Thereof 

TANGO 378 includes a seven transmembrane domain which is typically found in 
G-protein coupled receptors. Proteins having such a domain play a role in transducing an 
extracellular signal, e.g., by interacting with a ligand and/or a cell-surface receptor, 
followed by mobilization of intracellular molecules that participate in signal transduction 
pathways (e.g., adenylate cyclase, or phosphatidylinositol 4,5-bisphosphate (PIP2), inositol 
1,4,5-triphosphate (IP3)). 

TANGO 378 polypeptides, nucleic acids, and modulators thereof can be used to 
modulate function, survival, morphology, migration, proliferation and/or differentiation of 
cells in the tissues in which it is expressed (e.g., natural killer cells). For example, TANGO 
378 polypeptides, nucleic acids, and modulators thereof can be used to modulate an immune 
response in a subject by, for example, (1) modulating immune cytotoxic responses against 
pathogenic organisms, e.g., viruses, bacteria, and parasites; (2) by modulating organ 
rejection after transplantation (e.g., skin graft, cardiac graft, islet graft); (3) by modulating 
immune recognition and lysis of normal and malignant cells; (4) by modulating T cell 
diseases; and (5) by controlling neoplastic growth, e.g., inhibition of tumor growth. 

Accordingly, TANGO 378 polypeptides, nucleic acids, and modulators thereof can 
be used to treat a variety of diseases involving aberrant immune responses, for example, 
aberrant T cell activity (e.g., aberrant T cell proliferation and/or secretion). A non-limiting 
list of diseases involving aberrant T cell activity is provided in the section entitled 
"TANGO 354" above. 

In other embodiments, TANGO 378 polypeptides, nucleic acids, and modulators 
thereof can be used to treat a variety of neoplastic diseases, including hematopoietic 
malignancies and including, but not limited to, myeloid disorders, lymphoid malignancies, 
and/or malignancies of the various organ systems. A non-limiting list of such neoplastic 
diseases is provided in the section entitled "TANGO 354" above. 

Further, in light of TANGO 378's presence in a Natural Killer cell cDNA library, 
TANGO 378 expression can be utilized as a marker for specific tissues (e.g., lymphoid 
tissues such as the thymus and spleen) and/or cells (e.g., Natural Killer cells) in which 
TANGO 378 is expressed. TANGO 378 nucleic acids can also be utilized for chromosomal 
mapping. 

Tables 1 and 2 below provide summaries of INTERCEPT 340, MANGO 003, 
MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 sequence 
information. 



TABLE 1: Summary of Sequence Information for INTERCEPT 340, MANGO 003, 
MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 

Gene                  cDNA            ORF             Polypeptide     Figure          ATCC® Accession Number

INTERCEPT 340 human   SEQ ID NO:1     SEQ ID NO:3     SEQ ID NO:2     Figs. 1A-1B     PTA-250
MANGO 003 human       SEQ ID NO:4     SEQ ID NO:6     SEQ ID NO:5     Figs. 4A-4C     207178
MANGO 003 mouse       SEQ ID NO:7     SEQ ID NO:9     SEQ ID NO:8     Fig. 8
MANGO 347 human       SEQ ID NO:10    SEQ ID NO:12    SEQ ID NO:11    Fig. 10         PTA-250
TANGO 272 human       SEQ ID NO:13    SEQ ID NO:15    SEQ ID NO:14    Figs. 13A-13D   PTA-250
TANGO 272 mouse       SEQ ID NO:16    SEQ ID NO:18    SEQ ID NO:17    Figs. 16A-16B
TANGO 272 rat         SEQ ID NO:19    SEQ ID NO:21    SEQ ID NO:20    Figs. 33A-33C
TANGO 295 human       SEQ ID NO:22    SEQ ID NO:24    SEQ ID NO:23    Fig. 18         PTA-249
TANGO 354 human       SEQ ID NO:25    SEQ ID NO:27    SEQ ID NO:26    Figs. 21A-21B   PTA-249
TANGO 378 human       SEQ ID NO:28    SEQ ID NO:30    SEQ ID NO:29    Figs. 24A-24C   PTA-249

TABLE 2: Summary of Protein Domains of INTERCEPT 340, MANGO 003, 

MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 

Protein         Signal Peptide    Mature Protein    Extracellular     Transmembrane     Cytoplasmic
                                                    Domain            Domain            Domain

INTERCEPT 340
human

MANGO 003       AA 1-24 of        AA 25-504 of      AA 25-374 of      AA 375-398 of     AA 399-504 of
human           SEQ ID NO:5       SEQ ID NO:5       SEQ ID NO:5       SEQ ID NO:5       SEQ ID NO:5
                SEQ ID NO:101     SEQ ID NO:102     SEQ ID NO:103     SEQ ID NO:104     SEQ ID NO:105

MANGO 003                         AA 1-208 of       AA 1-73 of        AA 74-96 of       AA 97-208 of
mouse                             SEQ ID NO:8       SEQ ID NO:8       SEQ ID NO:8       SEQ ID NO:8
                                  SEQ ID NO:106     SEQ ID NO:107     SEQ ID NO:108     SEQ ID NO:109

MANGO 347       AA 1-35 of        AA 36-138 of
human           SEQ ID NO:11      SEQ ID NO:11
                SEQ ID NO:110     SEQ ID NO:111

TANGO 272       AA 1-20 of        AA 21-1050 of     AA 21-767 of      AA 768-791 of     AA 792-1050 of
human           SEQ ID NO:14      SEQ ID NO:14      SEQ ID NO:14      SEQ ID NO:14      SEQ ID NO:14
                SEQ ID NO:112     SEQ ID NO:113     SEQ ID NO:114     SEQ ID NO:115     SEQ ID NO:116

TANGO 272                         AA 1-497 of       AA 1-216 of       AA 217-240 of     AA 241-497 of
mouse                             SEQ ID NO:17      SEQ ID NO:17      SEQ ID NO:17      SEQ ID NO:17
                                  SEQ ID NO:117     SEQ ID NO:118     SEQ ID NO:119     SEQ ID NO:120

TANGO 272                         AA 1-636 of       AA 1-500 of       AA 501-524 of     AA 525-636 of
rat                               SEQ ID NO:20      SEQ ID NO:20      SEQ ID NO:20      SEQ ID NO:20
                                  SEQ ID NO:121     SEQ ID NO:122     SEQ ID NO:123     SEQ ID NO:124

TANGO 295       AA 1-28 of        AA 29-156 of
human           SEQ ID NO:23      SEQ ID NO:23
                SEQ ID NO:125     SEQ ID NO:126

TANGO 354       AA 1-19 of        AA 20-305 of      AA 20-169 of      AA 170-193 of     AA 194-305 of
human           SEQ ID NO:26      SEQ ID NO:26      SEQ ID NO:26      SEQ ID NO:26      SEQ ID NO:26
                SEQ ID NO:127     SEQ ID NO:128     SEQ ID NO:129     SEQ ID NO:130     SEQ ID NO:131

TANGO 378       AA 1-21 of        AA 22-528 of      AA 22-244 of      AA 245-269 of     AA 505-528 of
human           SEQ ID NO:29      SEQ ID NO:29      SEQ ID NO:29      SEQ ID NO:29      SEQ ID NO:29
                SEQ ID NO:132     SEQ ID NO:133     SEQ ID NO:134     SEQ ID NO:135     SEQ ID NO:142

                                                                      AA 287-306 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:136

                                                                      AA 323-343 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:137

                                                                      AA 358-376 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:138

                                                                      AA 414-438 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:139

                                                                      AA 457-477 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:140

                                                                      AA 485-504 of
                                                                      SEQ ID NO:29
                                                                      SEQ ID NO:141

Various aspects of the invention are described in further detail in the following 
subsections. 

I. Isolated Nucleic Acid Molecules 

One aspect of the invention pertains to isolated nucleic acid molecules that encode a
polypeptide of the invention or a biologically active portion thereof, as well as nucleic acid
molecules sufficient for use as hybridization probes to identify nucleic acid molecules
encoding a polypeptide of the invention and fragments of such nucleic acid molecules
suitable for use as PCR primers for the amplification or mutation of nucleic acid molecules.
As used herein, the term "nucleic acid molecule" is intended to include DNA molecules
(e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA
or RNA generated using nucleotide analogs. The nucleic acid molecule can be
single-stranded or double-stranded, but preferably is double-stranded DNA.

An "isolated" nucleic acid molecule is one which is separated from other nucleic
acid molecules which are present in the natural source of the nucleic acid molecule.
Preferably, an "isolated" nucleic acid molecule is free of sequences (preferably protein
encoding sequences) which naturally flank the nucleic acid (i.e., sequences located at the 5'
and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the
nucleic acid is derived. In other embodiments, the "isolated" nucleic acid is free of intron
sequences. For example, in various embodiments, the isolated nucleic acid molecule can 
contain less than about 5 kB, 4 kB, 3 kB, 2 kB, 1 kB, 0.5 kB or 0.1 kB of nucleotide 
sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from 
which the nucleic acid is derived. Moreover, an "isolated" nucleic acid molecule, such as a 
cDNA molecule, can be substantially free of other cellular material, or culture medium 
when produced by recombinant techniques, or substantially free of chemical precursors or 
other chemicals when chemically synthesized. In one embodiment, the nucleic acid 
molecules of the invention comprise a contiguous open reading frame encoding a 
polypeptide of the invention. 

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule
having the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19,
21, 22, 24, 25, 27, 28 or 30, or a complement thereof, can be isolated using standard
molecular biology techniques and the sequence information provided herein. Using all or a
portion of the nucleic acid sequences of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18,
19, 21, 22, 24, 25, 27, 28 or 30 as a hybridization probe, nucleic acid molecules of the
invention can be isolated using standard hybridization and cloning techniques (e.g., as
described in Sambrook et al., eds., Molecular Cloning: A Laboratory Manual, 2nd
ed., 1989, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY).

A nucleic acid molecule of the invention can be amplified using cDNA, mRNA or
genomic DNA as a template and appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to all or a portion of a nucleic acid molecule of the
invention can be prepared by standard synthetic techniques, e.g., using an automated DNA
synthesizer.

In another preferred embodiment, an isolated nucleic acid molecule of the invention
comprises a nucleic acid molecule which is a complement of the nucleotide sequence of
SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a
portion thereof. A nucleic acid molecule which is complementary to a given nucleotide
sequence is one which is sufficiently complementary to the given nucleotide sequence that
it can hybridize to the given nucleotide sequence thereby forming a stable duplex.
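
As an illustration of complementarity, the following Python sketch computes the perfectly complementary strand of a DNA sequence, read 5' to 3'. It covers only the strict case of exact Watson-Crick pairing, whereas the definition above requires only complementarity sufficient to form a stable duplex. The function names are illustrative assumptions.

```python
# Watson-Crick pairing for DNA; upper and lower case are both handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the exactly complementary strand of seq, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(a, b):
    """True if strand b pairs base-for-base with strand a (strict case only)."""
    return reverse_complement(a).upper() == b.upper()


example = reverse_complement("ATGC")
```

A sequence that passes `is_perfect_complement` is complementary in the sense used here; sequences with a small number of mismatches may also qualify, which this sketch does not attempt to model.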

Moreover, a nucleic acid molecule of the invention can comprise only a portion of a
nucleic acid sequence encoding a full length polypeptide of the invention, for example, a
fragment which can be used as a probe or primer or a fragment encoding a biologically
active portion of a polypeptide of the invention. The nucleotide sequence determined from
the cloning of one gene allows for the generation of probes and primers designed for use in
identifying and/or cloning homologues in other cell types, e.g., from other tissues, as well as
homologues from other mammals. The probe/primer typically comprises a substantially
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide
sequence that hybridizes under stringent conditions to at least about 12, preferably about 25,
more preferably about 50, 75, 100, 125, 150, 175, 200, 250, 300, 350 or 400 consecutive
nucleotides of the sense or anti-sense sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13,
15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or of a naturally occurring mutant of SEQ ID
NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30.

Probes based on the sequence of a nucleic acid molecule of the invention can be
used to detect transcripts or genomic sequences encoding the same protein molecule
encoded by a selected nucleic acid molecule. The probe comprises a label group attached
thereto, e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor.
Such probes can be used as part of a diagnostic test kit for identifying cells or tissues which
mis-express the protein, such as by measuring levels of a nucleic acid molecule encoding
the protein in a sample of cells from a subject, e.g., detecting mRNA levels or determining
whether a gene encoding the protein has been mutated or deleted.

A nucleic acid fragment encoding a biologically active portion of a polypeptide of
the invention can be prepared by isolating a portion of any of SEQ ID NOs:1, 3, 4, 6, 7, 9,
10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, expressing the encoded portion of the
polypeptide (e.g., by recombinant expression in vitro) and assessing the activity of
the encoded portion of the polypeptide.

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24,
25, 27, 28 or 30, due to degeneracy of the genetic code and thus encode the same protein as
that encoded by the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18,
19, 21, 22, 24, 25, 27, 28 or 30.
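
Degeneracy of the genetic code can be illustrated with a short Python sketch. It builds the standard genetic code table and shows two different nucleotide sequences that translate to the same protein. The two toy sequences and the function name `translate` are assumptions made for illustration and are unrelated to the sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two toy coding sequences that differ at several codon positions.
seq_a = "ATGGCTTTAAGATAA"
seq_b = "ATGGCCCTGCGTTGA"
same_protein = translate(seq_a) == translate(seq_b)
```

Here `seq_a` and `seq_b` differ in nucleotide sequence but both translate to `MALR*`, which is the situation the paragraph above describes.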

In addition to the nucleotide sequences of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13,
15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, it will be appreciated by those skilled in the art
that DNA sequence polymorphisms that lead to changes in the amino acid sequence may
exist within a population (e.g., the human population). Such genetic polymorphisms may
exist among individuals within a population due to natural allelic variation. An allele is one
of a group of genes which occur alternatively at a given genetic locus. As used herein, the
phrase "allelic variant" refers to a nucleotide sequence which occurs at a given locus or to a
polypeptide encoded by the nucleotide sequence. As used herein, the terms "gene" and
"recombinant gene" refer to nucleic acid molecules comprising an open reading frame
encoding a polypeptide of the invention. Such natural allelic variations can typically result
in 1-5% variance in the nucleotide sequence of a given gene. Alternative alleles can be
identified by sequencing the gene of interest in a number of different individuals. This can
be readily carried out by using hybridization probes to identify the same genetic locus in a
variety of individuals. Any and all such nucleotide variations and resulting amino acid
polymorphisms or variations that are the result of natural allelic variation and that do not
alter the functional activity are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding proteins of the invention from other
species (homologues), which have a nucleotide sequence which differs from that of the
human protein described herein, are intended to be within the scope of the invention.
Nucleic acid molecules corresponding to natural allelic variants and homologues of a cDNA
of the invention can be isolated based on their identity to the human nucleic acid molecule
disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe
according to standard hybridization techniques under stringent hybridization conditions.
For example, a cDNA encoding a soluble form of a membrane-bound protein of the
invention can be isolated based on its hybridization to a nucleic acid molecule encoding all
or part of the membrane-bound form. Likewise, a cDNA encoding a membrane-bound form can
be isolated based on its hybridization to a nucleic acid molecule encoding all or part of the
soluble form.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 300 (325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900,
1000, or 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600,
3800, 4000, or 4200) nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence, preferably the coding sequence,
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a
complement thereof.

As used herein, the term "hybridizes under stringent conditions" is intended to
describe conditions for hybridization and washing under which nucleotide sequences at
least 60% (65%, 70%, preferably 75%) identical to each other typically remain hybridized
to each other. Such stringent conditions are known to those skilled in the art and can be
found in Current Protocols in Molecular Biology, 1989, John Wiley & Sons, NY, sections
6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization conditions is
hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or
more washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an isolated nucleic acid
molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ
ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a
complement thereof, corresponds to a naturally-occurring nucleic acid molecule. As used
herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule
having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

In addition to naturally-occurring allelic variants of a nucleic acid molecule of the
invention that may exist in the population, the skilled artisan will further
appreciate that changes can be introduced by mutation thereby leading to changes in the
amino acid sequence of the encoded protein, without altering the biological activity of the
protein. For example, one can make nucleotide substitutions leading to amino acid
substitutions at "non-essential" amino acid residues. A "non-essential" amino acid residue
is a residue that can be altered from the wild-type sequence without altering the biological
activity, whereas an "essential" amino acid residue is required for biological activity. For
example, amino acid residues that are not conserved or only semi-conserved among
homologues of various species may be non-essential for activity and thus would be likely
targets for alteration. Alternatively, amino acid residues that are conserved among the
homologues of various species (e.g., murine and human) may be essential for activity and
thus would not be likely targets for alteration.

Accordingly, another aspect of the invention pertains to nucleic acid molecules
encoding a polypeptide of the invention that contain changes in amino acid residues that are
not essential for activity. Such polypeptides differ in amino acid sequence from SEQ ID
NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, yet retain biological activity. In one embodiment,
the isolated nucleic acid molecule includes a nucleotide sequence encoding a protein that
includes an amino acid sequence that is at least about 45%, 65%, 75%, 85%, 95%,
or 98% identical to the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26,
or 29.

An isolated nucleic acid molecule encoding a variant protein can be created by
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide
sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, such that one or more amino
acid substitutions, additions or deletions are introduced into the encoded protein. Mutations
can be introduced by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Briefly, PCR primers are designed that delete the trinucleotide
codon of the amino acid to be changed and replace it with the trinucleotide codon of the
amino acid to be included. This primer is used in the PCR amplification of DNA encoding
the protein of interest. This fragment is then isolated and inserted into the full length cDNA
encoding the protein of interest and expressed recombinantly. The resulting protein now
includes the amino acid replacement.
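
The codon replacement step can be sketched in Python as follows. This is a minimal illustration of the sequence bookkeeping only, not a primer design tool: it ignores melting temperature, primer length rules, and strand choice. The function names, the `flank` parameter, and the toy coding sequence are assumptions introduced for the example.

```python
def replace_codon(cds, aa_position, new_codon):
    """Replace the codon for the amino acid at 1-based aa_position."""
    if len(new_codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    start = 3 * (aa_position - 1)
    if aa_position < 1 or start + 3 > len(cds):
        raise ValueError("amino acid position outside coding sequence")
    return cds[:start] + new_codon.upper() + cds[start + 3:]


def mutagenic_primer(cds, aa_position, new_codon, flank=9):
    """Return the new codon plus up to `flank` unchanged bases on each side."""
    mutated = replace_codon(cds, aa_position, new_codon)
    start = 3 * (aa_position - 1)
    return mutated[max(0, start - flank):start + 3 + flank]


# Toy coding sequence: ATG GCT TTA AGA CCC GGG TAA (Met-Ala-Leu-Arg-Pro-Gly-stop).
toy_cds = "ATGGCTTTAAGACCCGGGTAA"
mutated_cds = replace_codon(toy_cds, 3, "GAT")  # Leu (TTA) -> Asp (GAT)
primer = mutagenic_primer(toy_cds, 3, "GAT", flank=6)
```

The returned primer carries the replacement codon flanked by wild-type sequence, so that it can anneal to the template on both sides of the change.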

Preferably, conservative amino acid substitutions are made at one or more predicted
non-essential amino acid residues. Conservative replacements are those that take place
within a family of amino acids that are related in their side chains. Genetically encoded
amino acids can be divided into four families: (1) acidic = aspartate, glutamate; (2) basic
= lysine, arginine, histidine; (3) nonpolar = alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine, asparagine,
glutamine, cysteine, serine, threonine, tyrosine. In similar fashion, the amino acid repertoire
can be grouped as (1) acidic = aspartate, glutamate; (2) basic = lysine, arginine, histidine; (3)
aliphatic = glycine, alanine, valine, leucine, isoleucine, serine, threonine, with serine and
threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic =
phenylalanine, tyrosine, tryptophan; (5) amide = asparagine, glutamine; and (6)
sulfur-containing = cysteine and methionine. (See, for example, Biochemistry, 4th ed., Ed. by L.
Stryer, WH Freeman and Co.: 1995).
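
The two groupings above can be encoded directly. The Python sketch below uses one-letter amino acid codes and treats a substitution as conservative when both residues fall in the same family. The names `FOUR_FAMILIES`, `SIX_FAMILIES`, `family_of`, and `is_conservative` are illustrative assumptions; note that proline is not assigned to any family in the second grouping as stated, so it is left unassigned there.

```python
# Four-family grouping (one-letter codes).
FOUR_FAMILIES = {
    "acidic": "DE",
    "basic": "KRH",
    "nonpolar": "AVLIPFMW",
    "uncharged polar": "GNQCSTY",
}

# Alternative six-family grouping; proline (P) is not listed in the text.
SIX_FAMILIES = {
    "acidic": "DE",
    "basic": "KRH",
    "aliphatic": "GAVLIST",
    "aromatic": "FYW",
    "amide": "NQ",
    "sulfur-containing": "CM",
}


def family_of(residue, families):
    """Return the family name for a one-letter residue, or None if unassigned."""
    residue = residue.upper()
    for name, members in families.items():
        if len(residue) == 1 and residue in members:
            return name
    return None


def is_conservative(a, b, families=FOUR_FAMILIES):
    """True if residues a and b belong to the same family of the grouping."""
    fam_a = family_of(a, families)
    return fam_a is not None and fam_a == family_of(b, families)
```

The choice of grouping matters: a phenylalanine-to-tyrosine change is not conservative under the four-family scheme (nonpolar versus uncharged polar) but is conservative under the six-family scheme (both aromatic).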

Alternatively, mutations can be introduced randomly along all or part of the coding
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for
biological activity to identify mutants that retain activity. Following mutagenesis, the 
encoded protein can be expressed recombinantly and the activity of the protein can be 
determined. 

In a preferred embodiment, a mutant polypeptide that is a variant of a polypeptide of 
the invention can be assayed for: (1) the ability to form protein-protein interactions with 
proteins in a signaling pathway of the polypeptide of the invention; (2) the ability to bind a 
ligand of the polypeptide of the invention; or (3) the ability to bind to an intracellular target 
protein of the polypeptide of the invention. In yet another preferred embodiment, the 
mutant polypeptide can be assayed for the ability to modulate cellular proliferation, cellular 
migration or chemotaxis, or cellular differentiation. 

The present invention encompasses antisense nucleic acid molecules, i.e., molecules 
which are complementary to a sense nucleic acid encoding a polypeptide of the invention, 
e.g., complementary to the coding strand of a double-stranded cDNA molecule or 
complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can 
hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to 
an entire coding strand, or to only a portion thereof, e.g., all or part of the protein coding 
region (or open reading frame). An antisense nucleic acid molecule can be antisense to all 
or part of a non-coding region of the coding strand of a nucleotide sequence encoding a 
polypeptide of the invention. The non-coding regions ("5' and 3' untranslated regions") are 
the 5' and 3' sequences which flank the coding region and are not translated into amino
acids. 
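
As a sketch of the complementarity described above, the following Python function derives a DNA antisense oligonucleotide against a chosen window of an mRNA. It assumes exact complementarity and 1-based coordinates; the function name, its parameters, and the toy mRNA are illustrative assumptions.

```python
# Map each mRNA base to the DNA base that pairs with it.
_DNA_COMPLEMENT_OF_RNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length):
    """Return a DNA oligo (5'->3') complementary to mrna[start .. start+length-1].

    `start` is 1-based. A cDNA-style sequence (with T) is also accepted and
    is converted to RNA before complementing.
    """
    mrna = mrna.upper().replace("T", "U")
    if start < 1 or length < 1 or start - 1 + length > len(mrna):
        raise ValueError("window outside the mRNA sequence")
    target = mrna[start - 1:start - 1 + length]
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(_DNA_COMPLEMENT_OF_RNA)[::-1]


toy_mrna = "AUGGCCAAG"
oligo = antisense_oligo(toy_mrna, 1, 6)
```

The window can be moved along the coding region or into the untranslated regions by changing `start`, which corresponds to targeting all or part of the coding or non-coding regions.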

An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40,
45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide)
can be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of
modified nucleotides which can be used to generate the antisense nucleic acid include
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, β-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, β-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid
methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense
nucleic acid can be produced biologically using an expression vector into which a nucleic
acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted
nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described
further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a selected polypeptide of the invention to thereby inhibit
expression, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule which binds to DNA duplexes, through specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target
selected cells and then administered systemically. For example, for systemic
administration, antisense molecules can be modified such that they specifically bind to
receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense
nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or
antigens. The antisense nucleic acid molecules can also be delivered to cells using the
vectors described herein. To achieve sufficient intracellular concentrations of the antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under
the control of a strong pol II or pol III promoter are preferred.

An antisense nucleic acid molecule of the invention can be an α-anomeric nucleic
acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded
hybrids with complementary RNA in which, contrary to the usual β-units, the strands run
parallel to each other (Gaultier et al., 1987, Nucleic Acids Res. 15:6625-41). The antisense
nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al., 1987,
Nucleic Acids Res. 15:6131-48) or a chimeric RNA-DNA analogue (Inoue et al., 1987,
FEBS Lett. 215:327-30).

The invention also encompasses ribozymes. Ribozymes are catalytic RNA
molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic
acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes
(e.g., hammerhead ribozymes; described in Haselhoff and Gerlach, 1988, Nature 334:585-91)
can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of the
protein encoded by the mRNA. A ribozyme having specificity for a nucleic acid molecule
encoding a polypeptide of the invention can be designed based upon the nucleotide
sequence of a cDNA disclosed herein. For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved (see Cech et al., U.S. Patent No.
4,987,071; and Cech et al., U.S. Patent No. 5,116,742). Alternatively, an mRNA encoding a
polypeptide of the invention can be used to select a catalytic RNA having a specific
ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel and Szostak, 1993,
Science 261:1411-8.

The invention also encompasses nucleic acid molecules which form triple helical
structures. For example, expression of a polypeptide of the invention can be inhibited by
targeting nucleotide sequences complementary to the regulatory region of the gene
encoding the polypeptide (e.g., the promoter and/or enhancer) to form triple helical
structures that prevent transcription of the gene in target cells. See generally Helene, 1991,
Anticancer Drug Des. 6(6):569-84; Helene, 1992, Ann. N.Y. Acad. Sci. 660:27-36; and
Maher, 1992, Bioassays 14(12):807-15.

In various embodiments, the nucleic acid molecules of the invention can be
modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the
stability, hybridization, or solubility of the molecule. For example, the deoxyribose
phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids
(see Hyrup et al., 1996, Bioorganic & Medicinal Chemistry 4(1):5-23). As used herein, the
terms "peptide nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in
which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and
only the four natural nucleobases are retained. The neutral backbone of PNAs has been
shown to allow for specific hybridization to DNA and RNA under conditions of low ionic
strength. The synthesis of PNA oligomers can be performed using standard solid phase
peptide synthesis protocols as described in Hyrup et al., 1996, supra; Perry-O'Keefe et al.,
1996, Proc. Natl. Acad. Sci. USA 93:14670-5.

PNAs can be used in therapeutic and diagnostic applications. For example, PNAs
can be used as antisense or antigene agents for sequence-specific modulation of gene
expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs can also be used in the analysis of single base pair mutations in a gene by, e.g.,
PNA directed PCR clamping; as artificial restriction enzymes when used in combination
with other enzymes, e.g., S1 nucleases (Hyrup, 1996, supra); or as probes or primers for
DNA sequencing and hybridization (Hyrup, 1996, supra; Perry-O'Keefe et al., 1996, Proc.
Natl. Acad. Sci. USA 93:14670-675).

In another embodiment, PNAs can be modified, e.g., to enhance their stability or
cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of
PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known
in the art. For example, PNA-DNA chimeras can be generated which may combine the
advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the
PNA portion would provide high binding affinity and specificity. PNA-DNA chimeras can
be linked using linkers of appropriate lengths selected in terms of base stacking, number of
bonds between the nucleobases, and orientation (Hyrup, 1996, supra). The synthesis of
PNA-DNA chimeras can be performed as described in Hyrup (1996, supra) and Finn et al.
(1996, Nucleic Acids Res. 24(17):3357-63). For example, a DNA chain can be synthesized
on a solid support using standard phosphoramidite coupling chemistry and modified
nucleoside analogs. Compounds such as 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite can be used as a link between the PNA and the 5' end of DNA (Mag et al.,
1989, Nucleic Acids Res. 17:5973-88). PNA monomers are then coupled in a stepwise
manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment
(Finn et al., 1996, Nucleic Acids Res. 24(17):3357-63). Alternatively, chimeric molecules
can be synthesized with a 5' DNA segment and a 3' PNA segment (Petersen et al., 1995,
Bioorganic Med. Chem. Lett. 5:1119-1124).

In other embodiments, the oligonucleotide may include other appended groups such
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA
86:6553-6; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA 84:648-52; PCT Publication
No. WO 88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134).
In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents
(see, e.g., Krol et al., 1988, Bio/Techniques 6:958-76) or intercalating agents (see, e.g., Zon,
1988, Pharm. Res. 5:539-49). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport
agent, hybridization-triggered cleavage agent, etc.



II. Isolated Proteins and Antibodies

One aspect of the invention pertains to isolated proteins, and biologically active 
portions thereof, as well as polypeptide fragments suitable for use as immunogens to raise 
antibodies directed against a polypeptide of the invention. In one embodiment, the native 
polypeptide can be isolated from cells or tissue sources by an appropriate purification 
scheme using standard protein purification techniques. In another embodiment, 
polypeptides of the invention are produced by recombinant DNA techniques. Alternative to
recombinant expression, a polypeptide of the invention can be synthesized chemically using 
standard peptide synthesis techniques. 

An "isolated" or "purified" protein or biologically active portion thereof is
substantially free of cellular material or other contaminating proteins from the cell or tissue
source from which the protein is derived, or substantially free of chemical precursors or
other chemicals when chemically synthesized. The language "substantially free of cellular
material" includes preparations of protein in which the protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced. Thus, protein
that is substantially free of cellular material includes preparations of protein having less
than about 30%, 20%, 10%, or 5% (by dry weight) of heterologous protein (also referred to
herein as a "contaminating protein"). When the protein or biologically active portion
thereof is recombinantly produced, it is also preferably substantially free of culture medium,
i.e., culture medium represents less than about 20%, 10%, or 5% of the volume of the
protein preparation. When the protein is produced by chemical synthesis, it is preferably
substantially free of chemical precursors or other chemicals, i.e., it is separated from
chemical precursors or other chemicals which are involved in the synthesis of the protein.
Accordingly such preparations of the protein have less than about 30%, 20%, 10%, 5% (by
dry weight) of chemical precursors or compounds other than the polypeptide of interest.
The term "pure" or "isolated" as used herein preferably has the same numerical limits as
"purified" or "isolated" immediately above. "Isolated" and "purified" do not encompass
either natural materials in their native state or natural materials that have been separated into
components (e.g., in an acrylamide gel) but not obtained either as pure (e.g., lacking
contaminating proteins, or chromatography reagents such as denaturing agents and
polymers, e.g., acrylamide or agarose) substances or solutions. In preferred embodiments,
purified or isolated preparations will lack any contaminating proteins from the same animal
from which the protein is normally produced, as can be accomplished by recombinant
expression of, for example, a human protein in a non-human cell.

Biologically active portions of a polypeptide of the invention include polypeptides
comprising amino acid sequences sufficiently identical to or derived from the amino acid
sequence of the protein (e.g., the amino acid sequence shown in any of SEQ ID NOs:2, 5, 8,
11, 14, or 17), which include fewer amino acids than the full length protein, and exhibit at
least one activity of the corresponding full-length protein. Typically, biologically active 
portions comprise a domain or motif with at least one activity of the corresponding protein. 
A biologically active portion of a protein of the invention can be a polypeptide which is, for 
example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically 
active portions, in which other regions of the protein are deleted, can be prepared by 
recombinant techniques and evaluated for one or more of the functional activities of the 
native form of a polypeptide of the invention. 

Preferred polypeptides have the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14,
17, 20, 23, 26, or 29. Other useful proteins are substantially identical (e.g., at least about
45%, preferably 55%, 65%, 75%, 85%, 95%, or 99%) to any of SEQ ID NOs:2, 5, 8, 11, 14,
17, 20, 23, 26, or 29 and retain the functional activity of the corresponding
naturally-occurring protein yet differ in amino acid sequence due to natural allelic variation
or mutagenesis.

To determine the percent identity of two amino acid sequences or of two nucleic 
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be 
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal 
alignment with a second amino or nucleic acid sequence). The amino acid residues or 
nucleotides at corresponding amino acid positions or nucleotide positions are then 
compared. When a position in the first sequence is occupied by the same amino acid 
residue or nucleotide as the corresponding position in the second sequence, then the 
molecules are identical at that position. The percent identity between the two sequences is a 
function of the number of identical positions shared by the sequences (i.e., % identity = # of
identical positions/total # of positions (e.g., overlapping positions) x 100). In one
embodiment the two sequences are the same length. 
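
The percent identity calculation can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned (gaps written as `-`) and simply counts positions. The function name, the `overlap_only` switch, and the toy alignment are assumptions made for illustration: with `overlap_only=False` the denominator is every aligned column, and with `overlap_only=True` it is restricted to columns where both sequences have a residue.

```python
def percent_identity(aligned_a, aligned_b, gap="-", overlap_only=False):
    """Percent identity = identical positions / total positions x 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column of gaps only carries no information
        gapped = x == gap or y == gap
        if gapped and overlap_only:
            continue  # count only positions present in both sequences
        total += 1
        if not gapped and x == y:
            identical += 1
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


# Toy alignment: 4 identical columns out of 6 (5 if the gapped column is skipped).
a = "ACGT-A"
b = "ACCTGA"
identity_all = percent_identity(a, b)
identity_overlap = percent_identity(a, b, overlap_only=True)
```

For this toy alignment the two conventions give roughly 66.7% and 80%, which shows why the choice of denominator should be stated alongside any reported identity value.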

The determination of percent identity between two sequences can be accomplished
using a mathematical algorithm. A preferred, non-limiting example of a mathematical
algorithm utilized for the comparison of two sequences is the algorithm of Karlin and
Altschul (1990, Proc. Natl. Acad. Sci. USA 87:2264-8), modified as in Karlin and Altschul
(1993, Proc. Natl. Acad. Sci. USA 90:5873-7). Such an algorithm is incorporated into the
NBLAST and XBLAST programs of Altschul et al. (1990, J. Mol. Biol. 215:403-10).
BLAST nucleotide searches can be performed with the NBLAST program, score = 100,
wordlength = 12 to obtain nucleotide sequences homologous to a nucleic acid molecule of
the invention. BLAST protein searches can be performed with the XBLAST program,
score = 50, wordlength = 3 to obtain amino acid sequences homologous to a protein
molecule of the invention. To obtain gapped alignments for comparison purposes, Gapped
BLAST can be utilized as described in Altschul et al. (1997, Nucleic Acids Res.
25:3389-402). Alternatively, PSI-Blast can be used to perform an iterated search which detects
distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST,
and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST
and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred,
non-limiting example of a mathematical algorithm utilized for the comparison of sequences is
the algorithm of Myers and Miller (1988, CABIOS 4:11-7). Such an algorithm is
incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence
alignment software package. When utilizing the ALIGN program for comparing amino
acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap
penalty of 4 can be used.

The percent identity between two sequences can be determined using techniques 
similar to those described above, with or without allowing gaps. In calculating percent 
identity, typically exact matches are counted. 
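
The effect of allowing gaps can be illustrated with a short Python sketch. It is not the Karlin and Altschul algorithm or the Myers and Miller algorithm named above, and it does not use a PAM120 table; it is a plain Needleman-Wunsch global alignment with a linear gap penalty, substituted here only because it is compact. The function names and the scoring values (match = 1, mismatch = -1, gap = -2) are assumptions chosen for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(aligned_a, aligned_b, count_gaps=True):
    """Percent of exact matches, over all columns or over ungapped columns."""
    pairs = list(zip(aligned_a, aligned_b))
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    if count_gaps:
        total = len(pairs)
    else:
        total = sum(1 for x, y in pairs if x != "-" and y != "-")
    return 100.0 * matches / total if total else 0.0
```

Aligning "MKTAYIAK" with "MKAYIAK" in this sketch places a single gap opposite the threonine; the reported identity then depends on whether the gap column is counted as a position.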

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises all or part (preferably biologically active) of a
polypeptide of the invention operably linked to a heterologous polypeptide (i.e., a 
polypeptide other than the same polypeptide of the invention). Within the fusion protein, 
the term "operably linked" is intended to indicate that the polypeptide of the invention and 
the heterologous polypeptide are fused in-frame to each other. The heterologous 
polypeptide can be fused to the N-terminus or C-terminus of the polypeptide of the 
invention. 
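
The in-frame requirement can be made concrete with a minimal Python sketch. The helper names codons and fuse_in_frame and the example sequences are hypothetical; the sketch only checks that the upstream coding sequence is a whole number of codons and carries no in-frame stop codon, so that the downstream partner is read in its own frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive triplets starting at position 0."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(n_terminal_cds, c_terminal_cds):
    """Join two coding sequences so the downstream part stays in frame.

    The upstream sequence must be a whole number of codons and must not
    contain an in-frame stop codon; otherwise translation would shift
    frame or terminate before reaching the downstream partner.
    """
    upstream = n_terminal_cds.upper()
    downstream = c_terminal_cds.upper()
    if len(upstream) % 3 != 0:
        raise ValueError("upstream sequence is not a whole number of codons")
    if any(c in STOP_CODONS for c in codons(upstream)):
        raise ValueError("upstream sequence contains an in-frame stop codon")
    return upstream + downstream
```

In practice the upstream partner's own stop codon would be removed before fusion, which is why the sketch rejects it rather than silently trimming it.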

One useful fusion protein is a GST fusion protein in which the polypeptide of the 
invention is fused to the C-terminus of GST sequences. Such fusion proteins can facilitate 
the purification of a recombinant polypeptide of the invention.

In another embodiment, the fusion protein contains a heterologous signal peptide at
its N-terminus. For example, the native signal peptide of a polypeptide of the invention can
be removed and replaced with a signal peptide from another protein. For example, the gp67
secretory sequence of the baculovirus envelope protein can be used as a heterologous signal
peptide (Current Protocols in Molecular Biology, 1992, Ausubel et al., eds., John Wiley &
Sons). Other examples of eukaryotic heterologous signal peptides include the secretory
sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla,
California). In yet another example, useful prokaryotic heterologous signal peptides include
the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal
(Pharmacia Biotech; Piscataway, New Jersey).

In yet another embodiment, the fusion protein is an immunoglobulin fusion protein 
in which all or part of a polypeptide of the invention is fused to sequences derived from a 
member of the immunoglobulin protein family. The immunoglobulin fusion proteins of the
invention can be incorporated into pharmaceutical compositions and administered to a 
subject to inhibit an interaction between a ligand (soluble or membrane-bound) and a 
protein on the surface of a cell (receptor), to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion protein can be used to affect the bioavailability of a cognate 
ligand of a polypeptide of the invention. Inhibition of ligand/receptor interaction may be 
useful therapeutically, both for treating proliferative and differentiative disorders and for 
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin
fusion proteins of the invention can be used as immunogens to produce antibodies directed 
against a polypeptide of the invention in a subject, to purify ligands and in screening assays 
to identify molecules which inhibit the interaction of receptors with ligands. 

Chimeric and fusion proteins of the invention can be produced by standard 
recombinant DNA techniques. In another embodiment, the fusion gene can be synthesized 
by conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers which give rise to
complementary overhangs between two consecutive gene fragments which can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, e.g.,
Ausubel et al., supra). Moreover, many expression vectors are commercially available that
already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the 
fusion moiety is linked in-frame to the polypeptide of the invention. 

A signal peptide of a polypeptide of the invention (SEQ ID NOs:101, 110, 112, 125,
127, or 132) can be used to facilitate secretion and isolation of the secreted protein or other 
proteins of interest. Signal peptides are typically characterized by a core of hydrophobic
amino acids which are generally cleaved from the mature protein during secretion in one or
more cleavage events. Such signal peptides contain processing sites that allow cleavage of
the signal peptide from the mature proteins as they pass through the secretory pathway.
Thus, the invention pertains to the described polypeptides having a signal peptide, as well
as to the signal peptide itself and to the polypeptide in the absence of the signal peptide (i.e.,
the cleavage products). In one embodiment, a nucleic acid sequence encoding a signal
peptide of the invention can be operably linked in an expression vector to a protein of
interest, such as a protein which is ordinarily not secreted or is otherwise difficult to isolate.
The signal peptide directs secretion of the protein, such as from a eukaryotic host into which
the expression vector is transformed, and the signal peptide is subsequently or concurrently
cleaved. The protein can then be readily purified from the extracellular medium by
art-recognized methods. Alternatively, the signal peptide can be linked to the protein of
interest using a sequence which facilitates purification, such as with a GST domain. 

In another embodiment, the signal peptides of the present invention can be used to 
identify regulatory sequences, e.g., promoters, enhancers, repressors. Since signal peptides
are the most amino-terminal sequences of a peptide, it is expected that the nucleic acids
which flank the signal peptide on its amino-terminal side will be regulatory sequences
which affect transcription. Thus, a nucleotide sequence which encodes all or a portion of a
signal peptide can be used as a probe to identify and isolate signal peptides and their
flanking regions, and these flanking regions can be studied to identify regulatory elements
therein.

The present invention also pertains to variants of the polypeptides of the invention. 
Such variants have an altered amino acid sequence which can function as either agonists 
(mimetics) or as antagonists. Variants can be generated by mutagenesis, e.g., discrete point 
mutation or truncation. An agonist can retain substantially the same, or a subset, of the 
biological activities of the naturally occurring form of the protein. An antagonist of a 
protein can inhibit one or more of the activities of the naturally occurring form of the 
protein by, for example, competitively binding to a downstream or upstream member of a 
cellular signaling cascade which includes the protein of interest. Thus, specific biological 
effects can be elicited by treatment with a variant of limited function. Treatment of a 
subject with a variant having a subset of the biological activities of the naturally occurring 
form of the protein can have fewer side effects in a subject relative to treatment with the 
naturally occurring form of the protein. 

Modification of the structure of the subject polypeptides can be for such purposes as 
enhancing therapeutic or prophylactic efficacy, stability (e.g., ex vivo shelf life and 
resistance to proteolytic degradation in vivo), or post-translational modifications (e.g., to
alter the phosphorylation pattern of the protein). Such modified peptides, when designed to retain
at least one activity of the naturally-occurring form of the protein, or to produce specific
antagonists thereof, are considered functional equivalents of the polypeptides described in
more detail herein. Such modified peptides can be produced, for instance, by amino acid 
substitution, deletion, or addition. 

For example, it is reasonable to expect that an isolated replacement of a leucine with 
an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar 
replacement of an amino acid with a structurally related amino acid (i.e., isosteric and/or
isoelectric mutations) will not have a major effect on the biological activity of the resulting 
molecule. 
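
The kind of replacement described above can be flagged with a small Python sketch. Only the first three groups below come from the examples in the text; the remaining groups, the function names, and the example sequences are assumptions added for illustration, and real substitution scoring would normally rely on an established matrix rather than fixed groups.

```python
# Illustrative groupings of structurally related residues (one-letter codes).
CONSERVATIVE_GROUPS = [
    {"L", "I", "V"},   # leucine / isoleucine / valine (from the text)
    {"D", "E"},        # aspartate / glutamate (from the text)
    {"S", "T"},        # serine / threonine (from the text)
    {"K", "R"},        # assumption: basic side chains
    {"N", "Q"},        # assumption: amide side chains
    {"F", "Y", "W"},   # assumption: aromatic side chains
]


def is_conservative(original, replacement):
    """True if both residues are identical or share one of the groups."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return True
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def list_substitutions(wild_type, variant):
    """Return (position, wild-type, variant, conservative?) per difference.

    Positions are 1-based; the two sequences must have equal length.
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have the same length")
    pairs = zip(wild_type.upper(), variant.upper())
    return [(i + 1, w, v, is_conservative(w, v))
            for i, (w, v) in enumerate(pairs) if w != v]
```

A variant flagged as conservative here would still need to be tested functionally, as the following paragraphs describe.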

Whether a change in the amino acid sequence of a peptide results in a functional 
homolog (e.g., functional in the sense that the resulting polypeptide mimics or antagonizes
the wild-type form) can be readily determined by assessing the ability of the variant peptide 
to produce a response in cells in a fashion similar to the wild-type protein, or competitively 
inhibit such a response. Polypeptides in which more than one replacement has taken place 
can readily be tested in the same manner. 

Variants of a protein of the invention which function as either agonists (mimetics) or 
as antagonists can be identified by screening combinatorial libraries of mutants, e.g.,
truncation mutants, of the protein of the invention for agonist or antagonist activity. In one 
embodiment, a variegated library of variants is generated by combinatorial mutagenesis at 
the nucleic acid level and is encoded by a variegated gene library. A variegated library of 
variants can be produced by, for example, enzymatically ligating a mixture of synthetic 
oligonucleotides into gene sequences such that a degenerate set of potential protein 
sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion 
proteins (e.g., for phage display). There are a variety of methods which can be used to 
produce libraries of potential variants of the polypeptides of the invention from a degenerate
oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known
in the art (see, e.g., Narang, 1983, Tetrahedron 39:3; Itakura et al., 1984, Annu. Rev.
Biochem. 53:323; Itakura et al., 1984, Science 198:1056; Ike et al., 1983, Nucleic Acids
Res. 11:477).
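
To show what a degenerate oligonucleotide sequence encodes, the following Python sketch expands one into every concrete sequence it represents, using the standard IUPAC nucleotide ambiguity codes. The function name expand_degenerate and the example oligonucleotides are hypothetical and are not drawn from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The size of the resulting library is the product of the number of choices at each position, so a single fully degenerate NNK codon already yields 32 sequences and the count grows geometrically with each additional degenerate codon.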

In addition, libraries of fragments of the coding sequence of a polypeptide of the 
invention can be used to generate a variegated population of polypeptides for screening and 
subsequent selection of variants. For example, a library of coding sequence fragments can 
be generated by treating a double stranded PCR fragment of the coding sequence of interest 
with a nuclease under conditions wherein nicking occurs only about once per molecule, 
denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA 
which can include sense/antisense pairs from different nicked products, removing single 
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the
resulting fragment library into an expression vector. By this method, an expression library
can be derived which encodes N-terminal and internal fragments of various sizes of the 
protein of interest. 

Several techniques are known in the art for screening gene products of combinatorial 
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. The most widely used techniques, which are amenable 
to high through-put analysis, for screening large gene libraries typically include cloning the
gene library into replicable expression vectors, transforming appropriate cells with the
resulting library of vectors, and expressing the combinatorial genes under conditions in
which detection of a desired activity facilitates isolation of the vector encoding the gene
whose product was detected. Recursive ensemble mutagenesis (REM), a technique which
enhances the frequency of functional mutants in the libraries, can be used in combination
with the screening assays to identify variants of a protein of the invention (Arkin and
Yourvan, 1992, Proc. Natl. Acad. Sci. USA 89:7811-5; Delgrave et al., 1993, Protein
Engineering 6(3):327-331).

An isolated polypeptide of the invention, or a fragment thereof, can be used as an
immunogen to generate antibodies using standard techniques for polyclonal and monoclonal
antibody preparation. The full-length polypeptide or protein can be used or, alternatively,
the invention provides antigenic peptide fragments for use as immunogens. The antigenic
peptide of a protein of the invention comprises at least 8 (preferably 10, 15, 20, or 30)
amino acid residues of the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23,
26, or 29, and encompasses an epitope of the protein such that an antibody raised against the
peptide forms a specific immune complex with the protein. 

Preferred epitopes encompassed by the antigenic peptide are regions that are located 
on the surface of the protein, e.g., hydrophilic regions. Hydropathy plots or similar analyses 
can be used to identify hydrophilic regions. 
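
A hydropathy analysis of this kind can be sketched in Python as a sliding-window average. The specification does not name a particular scale; the Kyte-Doolittle hydropathy values, the window length of 7, the threshold of 0.0, and the function names are all assumptions chosen for this example.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of each window of `window` consecutive residues."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophilic_windows(seq, window=7, threshold=0.0):
    """Windows whose mean hydropathy is below `threshold`.

    Returns (start, end, mean) tuples with 1-based, inclusive positions.
    """
    profile = hydropathy_profile(seq, window)
    return [(i + 1, i + window, round(mean, 2))
            for i, mean in enumerate(profile) if mean < threshold]
```

Windows reported by hydrophilic_windows are only candidates for surface-exposed regions; a low hydropathy score does not by itself establish that a segment forms an accessible epitope.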

An immunogen typically is used to prepare antibodies by immunizing a suitable
subject (e.g., rabbit, goat, mouse or other mammal). An appropriate immunogenic
preparation can contain, for example, recombinantly expressed or chemically synthesized
polypeptide. The preparation can further include an adjuvant, such as Freund's complete or
incomplete adjuvant, or similar immunostimulatory agent. 

Accordingly, another aspect of the invention pertains to antibodies directed against
a polypeptide of the invention. The term "antibody" as used herein refers to
immunoglobulin molecules and immunologically active portions of immunoglobulin
molecules, i.e., molecules that contain an antigen binding site which specifically binds an
antigen, such as a polypeptide of the invention, e.g., an epitope of a polypeptide of the
invention. A molecule which specifically binds to a given polypeptide of the invention is a
molecule which binds the polypeptide, but does not substantially bind other molecules in a
sample, e.g., a biological sample, which naturally contains the polypeptide. Examples of
immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2
fragments which can be generated by treating the antibody with an enzyme such as pepsin.
The invention provides polyclonal and monoclonal antibodies. The term "monoclonal
antibody" or "monoclonal antibody composition", as used herein, refers to a population of
antibody molecules that contain only one species of an antigen binding site capable of
immunoreacting with a particular epitope.

Polyclonal antibodies can be prepared as described above by immunizing a suitable 
subject with a polypeptide of the invention as an immunogen. Preferred polyclonal 
antibody compositions are ones that have been selected for antibodies directed against a 
polypeptide or polypeptides of the invention. Particularly preferred polyclonal antibody 
preparations are ones that contain only antibodies directed against a polypeptide or 
polypeptides of the invention. Particularly preferred immunogen compositions are those 
that contain no other human proteins such as, for example, immunogen compositions made 
using a non-human host cell for recombinant expression of a polypeptide of the invention.
In such a manner, the only human epitope or epitopes recognized by the resulting antibody
compositions raised against this immunogen will be present as part of a polypeptide or 
polypeptides of the invention. 

The antibody titer in the immunized subject can be monitored over time by standard 
techniques, such as with an enzyme linked immunosorbent assay (ELISA) using
immobilized polypeptide. If desired, the antibody molecules can be isolated from the 
mammal (e.g., from the blood) and further purified by well-known techniques, such as 
protein A chromatography to obtain the IgG fraction. Alternatively, antibodies specific for 
a protein or polypeptide of the invention can be selected for (e.g., partially purified) or 
purified by, e.g., affinity chromatography. For example, a recombinantly expressed and
purified (or partially purified) protein of the invention is produced as described herein, and 
covalently or non-covalently coupled to a solid support such as, for example, a 
chromatography column. The column can then be used to affinity purify antibodies specific 
for the proteins of the invention from a sample containing antibodies directed against a large 
number of different epitopes, thereby generating a substantially purified antibody 
composition, i.e., one that is substantially free of contaminating antibodies. By a
substantially purified antibody composition is meant, in this context, that the antibody 
sample contains at most only 30% (by dry weight) of contaminating antibodies directed 
against epitopes other than those on the desired protein or polypeptide of the invention, and 
preferably at most 20%, yet more preferably at most 10%, and most preferably at most 5% 
(by dry weight) of the sample is contaminating antibodies. A purified antibody composition
means that at least 99% of the antibodies in the composition are directed against the desired
protein or polypeptide of the invention. 

At an appropriate time after immunization, e.g., when the specific antibody titers are
highest, antibody-producing cells can be obtained from the subject and used to prepare
monoclonal antibodies by standard techniques, such as the hybridoma technique (Kohler
and Milstein, 1975, Nature 256:495-7), the human B cell hybridoma technique (Kozbor et
al., 1983, Immunol. Today 4:72), the EBV-hybridoma technique (Cole et al., 1985,
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pgs. 77-96) or trioma
techniques. The technology for producing hybridomas is well known (see generally
Current Protocols in Immunology, 1994, Coligan et al., eds., John Wiley & Sons, Inc., New
York, NY). Hybridoma cells producing a monoclonal antibody of the invention are
detected by screening the hybridoma culture supernatants for antibodies that bind the
polypeptide of interest, e.g., using a standard ELISA assay. 

Alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal 
antibody directed against a polypeptide of the invention can be identified and isolated by 
screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage
display library) with the polypeptide of interest. Kits for generating and screening phage
display libraries are commercially available (e.g., the Pharmacia Recombinant Phage
Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit,
Catalog No. 240612). Additionally, examples of methods and reagents particularly
amenable for use in generating and screening an antibody display library can be found in, for
example, U.S. Patent No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication
No. WO 91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO
92/15679; PCT Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT
Publication No. WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al., 1991,
Bio/Technology 9:1370-2; Hay et al., 1992, Hum. Antibod. Hybridomas 3:81-5; Huse et al.,
1989, Science 246:1275-81; Griffiths et al., 1993, EMBO J. 12:725-34.

Additionally, recombinant antibodies, such as chimeric and humanized monoclonal 
antibodies, comprising both human and non-human portions, which can be made using 
standard recombinant DNA techniques, are within the scope of the invention. A chimeric 
antibody is a molecule in which different portions are derived from different animal species, 
such as those having a variable region derived from a murine mAb and a human 
immunoglobulin constant region. (See, e.g., Cabilly et al., U.S. Patent No. 4,816,567; and
Boss et al., U.S. Patent No. 4,816,397, which are incorporated herein by reference in their 
entirety.) Humanized antibodies are antibody molecules from non-human species having 
one or more complementarity determining regions (CDRs) from the non-human species and 
a framework region from a human immunoglobulin molecule. (See, e.g., Queen, U.S.
Patent No. 5,585,089, which is incorporated herein by reference in its entirety.) Such
chimeric and humanized monoclonal antibodies can be produced by recombinant DNA
techniques known in the art, for example using methods described in PCT Publication No.
WO 87/02671; European Patent Application 184,187; European Patent Application
171,496; European Patent Application 173,494; PCT Publication No. WO 86/01533; U.S.
Patent No. 4,816,567; European Patent Application 125,023; Better et al., 1988, Science
240:1041-3; Liu et al., 1987, Proc. Natl. Acad. Sci. USA 84:3439-43; Liu et al., 1987, J.
Immunol. 139:3521-6; Sun et al., 1987, Proc. Natl. Acad. Sci. USA 84:214-8; Nishimura et
al., 1987, Canc. Res. 47:999-1005; Wood et al., 1985, Nature 314:446-9; Shaw et al.,
1988, J. Natl. Cancer Inst. 80:1553-9; Morrison, 1985, Science 229:1202-7; Oi et al., 1986,
Bio/Techniques 4:214; U.S. Patent 5,225,539; Jones et al., 1986, Nature 321:522-5;
Verhoeyan et al., 1988, Science 239:1534; and Beidler et al., 1988, J. Immunol.
141:4053-60.

Completely human antibodies are particularly desirable for therapeutic treatment of 
human patients. Such antibodies can be produced, for example, using transgenic mice
which are incapable of expressing endogenous immunoglobulin heavy and light chain
genes, but which can express human heavy and light chain genes. The transgenic mice are 
immunized in the normal fashion with a selected antigen, e.g., all or a portion of a 
polypeptide of the invention. Monoclonal antibodies directed against the antigen can be 
obtained using conventional hybridoma technology. The human immunoglobulin 
transgenes harbored by the transgenic mice rearrange during B cell differentiation, and 
subsequently undergo class switching and somatic mutation. Thus, using such a technique, 
it is possible to produce therapeutically useful IgG, IgA and IgE antibodies. For an 
overview of this technology for producing human antibodies, see Lonberg and Huszar 
(1995, Int. Rev. Immunol. 13:65-93). For a detailed discussion of this technology for
producing human antibodies and human monoclonal antibodies and protocols for producing
such antibodies, see, e.g., U.S. Patent 5,625,126; U.S. Patent 5,633,425; U.S. Patent
5,569,825; U.S. Patent 5,661,016; and U.S. Patent 5,545,806. In addition, companies such
as Abgenix, Inc. (Fremont, CA), can be engaged to provide human antibodies directed
against a selected antigen using technology similar to that described above. 

Completely human antibodies which recognize a selected epitope can be generated 
using a technique referred to as "guided selection." In this approach a selected non-human 
monoclonal antibody, e.g., a murine antibody, is used to guide the selection of a completely 
human antibody recognizing the same epitope (Jespers et al., 1994, Bio/technology
12:899-903).

An antibody directed against a polypeptide of the invention (e.g., monoclonal
antibody) can be used to isolate the polypeptide by standard techniques, such as affinity
chromatography or immunoprecipitation. Moreover, such an antibody can be used to detect
the protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance 
and pattern of expression of the polypeptide. The antibodies can also be used diagnostically 
to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to
determine the efficacy of a given treatment regimen. Detection can be facilitated
by coupling the antibody to a detectable substance. Examples of detectable substances 
include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, 
bioluminescent materials, and radioactive materials. Examples of suitable enzymes include 
horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
examples of suitable prosthetic group complexes include streptavidin/biotin and 
avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, 
fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride 
or phycoerythrin; an example of a luminescent material includes luminol; examples of 
bioluminescent materials include luciferase, luciferin, and aequorin; and examples of
suitable radioactive material include 35S or 3H.

Further, an antibody (or fragment thereof) can be conjugated to a therapeutic moiety 
such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic 
agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin 
B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine,
vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone,
mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine,
lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic
agents include, but are not limited to, antimetabolites (e.g., methotrexate,
6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents
(e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and
lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin,
mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g.,
daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin
(formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and
anti-mitotic agents (e.g., vincristine and vinblastine).

The conjugates of the invention can be used for modifying a given biological
response; the drug moiety is not to be construed as limited to classical chemical therapeutic
agents. For example, the drug moiety may be a protein or polypeptide possessing a desired
biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A,
pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor,
α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue
plasminogen activator; or biological response modifiers such as, for example, lymphokines,
interleukin-1 ("IL-1"), interleukin-2 ("IL-2"), interleukin-6 ("IL-6"), granulocyte
macrophage colony stimulating factor ("GM-CSF"), granulocyte colony stimulating factor 
("G-CSF"), or other growth factors. 

Techniques for conjugating such therapeutic moiety to antibodies are well known, 
see, e.g., Arnon et al., "Monoclonal Antibodies For Immunotargeting Of Drugs In Cancer
Therapy", in Monoclonal Antibodies And Cancer Therapy, 1985, Reisfeld et al. (eds.), pgs. 
243-56, Alan R. Liss, Inc.; Hellstrom et al., "Antibodies For Drug Delivery", in Controlled 
Drug Delivery (2nd Ed.), 1987, Robinson et al. (eds.), pgs. 623-53, Marcel Dekker, Inc.; 
Thorpe, "Antibody Carriers Of Cytotoxic Agents In Cancer Therapy: A Review", in 
Monoclonal Antibodies '84: Biological And Clinical Applications, 1985, Pinchera et al. 
(eds.), pgs. 475-506; "Analysis, Results, And Future Prospective Of The Therapeutic Use 
Of Radiolabeled Antibody In Cancer Therapy", in Monoclonal Antibodies For Cancer 
Detection And Therapy, 1985, Baldwin et al. (eds.), pgs. 303-16, Academic Press, and 
Thorpe et aL, "The Preparation And Cj^otoxic Properties Of Antibody-Toxin Conjugates", 
Immunol. Rev., 1 982, 62: 1 1 9-58. 

Alternatively, an antibody can be conjugated to a second antibody to form an 
antibody heteroconjugate as described by Segal in U.S. Patent No. 4,676,980. 

Accordingly, in one aspect, the invention provides substantially purified antibodies or 
fragments thereof, and human or non-human antibodies or fragments thereof, which 
antibodies or fragments specifically bind to a polypeptide comprising an amino acid 
sequence selected from the group consisting of: the amino acid sequence of any one of SEQ 
ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid sequence encoded by the 
cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC® Accession 
Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at least 15 amino 
acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 
23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid 
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the 
percent identity is determined using the ALIGN program of the GCG software package with 
a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an 
amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the 
nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 
16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited as ATCC® 
Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC® Accession 
Number PTA-250, or a complement thereof, under conditions of hybridization of 6X SSC at 
45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. In various embodiments, the 
substantially purified antibodies of the invention, or fragments thereof, can be human, 
non-human, chimeric and/or humanized antibodies. 

In another aspect, the invention provides human or non-human antibodies or 
fragments thereof, which antibodies or fragments specifically bind to a polypeptide 
comprising an amino acid sequence selected from the group consisting of: the amino acid 
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid 
sequence encoded by the cDNA of a clone deposited as ATCC® Accession Number 207178, 
ATCC® Accession Number PTA-249, or ATCC® Accession Number PTA-250; a fragment 
of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 
5, 8, 11, 14, 17, 20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to 
the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, 
wherein the percent identity is determined using the ALIGN program of the GCG software 
package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty 
of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which 
hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited 
as ATCC® Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC® 
Accession Number PTA-250, or a complement thereof, under conditions of hybridization of 
6X SSC at 45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. Such non-human 
antibodies can be goat, mouse, sheep, horse, chicken, rabbit, or rat antibodies. 
Alternatively, the non-human antibodies of the invention can be chimeric and/or humanized 
antibodies. In addition, the human or non-human antibodies of the invention can be 
polyclonal antibodies or monoclonal antibodies. 
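
As an illustration of the 95% identity criterion recited above, the following minimal Python sketch computes percent identity over a pair of sequences that have already been aligned. It is not the ALIGN program and applies no PAM120 scoring or gap penalties. The function names, the gap character "-", the choice of all alignment columns as the denominator, and the example sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all columns of a pre-computed pairwise alignment.

    Assumption: the denominator is the total number of alignment columns;
    other conventions (e.g. overlapping positions only) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the alignment reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-residue sequences differing at one position.
reference = "MKTLLVAGAVLLSACQAEPR"
candidate = "MKTLLVAGAVLLSACQAEPK"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate))
```

With one mismatch in twenty aligned positions the candidate sits exactly at the 95% boundary, which is why the comparison in `meets_threshold` is written as "at least".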

In still a further aspect, the invention provides monoclonal antibodies or fragments 
thereof, which antibodies or fragments specifically bind to a polypeptide comprising an 
amino acid sequence selected from the group consisting of: the amino acid sequence of any 
one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid sequence 
encoded by the cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC® 
Accession Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at 
least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 
11, 14, 17, 20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to the 
amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, 
wherein the percent identity is determined using the ALIGN program of the GCG software 
package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty 
of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which 
hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited 
as any of ATCC® Accession Number 207178, ATCC® Accession Number PTA-249, or 
ATCC® Accession Number PTA-250, or a complement thereof, under conditions of 
hybridization of 6X SSC at 45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. The 
monoclonal antibodies can be human, humanized, chimeric and/or non-human antibodies. 
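
The hybridization and wash conditions recited above are stated as multiples of SSC. The sketch below converts those multiples into component concentrations. It assumes the conventional 1X SSC composition of 0.15 M sodium chloride and 0.015 M sodium citrate; that composition and all names in the code are assumptions, since this section does not define them.

```python
from fractions import Fraction

# Assumed composition of 1X SSC in mol/L (not defined in this section).
ONE_X_SSC = {
    "NaCl": Fraction(15, 100),
    "sodium citrate": Fraction(15, 1000),
}


def ssc_molarity(strength) -> dict:
    """Return component molarities for a given SSC strength (e.g. 6 or "0.2").

    Exact fractions are used internally so that 0.2X does not pick up
    binary floating-point noise before the final conversion.
    """
    factor = Fraction(str(strength))
    return {name: float(conc * factor) for name, conc in ONE_X_SSC.items()}


hybridization = ssc_molarity(6)    # 6X SSC, used at 45°C
wash = ssc_molarity("0.2")         # 0.2X SSC, used with 0.1% SDS at 65°C
print(hybridization)
print(wash)
```

The thirty-fold drop in salt between hybridization and wash, together with the higher wash temperature, is what makes the wash the more stringent step.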

The substantially purified antibodies or fragments thereof specifically bind to a 
signal peptide, a secreted sequence, an extracellular domain, a transmembrane domain, or a 
cytoplasmic domain of a polypeptide of the invention. In a 
particularly preferred embodiment, the substantially purified antibodies or fragments 
thereof, the human or non-human antibodies or fragments thereof, and/or the monoclonal 
antibodies or fragments thereof, of the invention specifically bind to a secreted sequence or 
an extracellular domain of the amino acid sequence of SEQ ID NOs:103, 107, 114, 118, 
122, 129, or 134. Preferably, the secreted sequence or extracellular domain to which the 
antibody, or fragment thereof, binds comprises from about amino acids 25-374 of SEQ ID 
NO:5 (SEQ ID NO:103), from amino acids 1-73 of SEQ ID NO:8 (SEQ ID NO:107), from 
amino acids 21-767 of SEQ ID NO:14 (SEQ ID NO:114), from amino acids 1-216 of SEQ 
ID NO:17 (SEQ ID NO:118), from amino acids 1-500 of SEQ ID NO:20 (SEQ ID NO:122), 
from amino acids 20-169 of SEQ ID NO:26 (SEQ ID NO:129), and from amino acids 
22-244 of SEQ ID NO:29 (SEQ ID NO:134). 
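
The residue ranges recited above are 1-based and inclusive, which is easy to get wrong when slicing sequences programmatically. The following sketch records the ranges from the preceding paragraph and extracts the corresponding region from a sequence. The dictionary and function names are hypothetical, and the placeholder sequence is invented for illustration; it is not any sequence of the invention.

```python
# Residue ranges (1-based, inclusive) as recited in the paragraph above,
# keyed by the SEQ ID NO of the full-length polypeptide.
EXTRACELLULAR_RANGES = {
    5: (25, 374),
    8: (1, 73),
    14: (21, 767),
    17: (1, 216),
    20: (1, 500),
    26: (20, 169),
    29: (22, 244),
}


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError(f"range {start}-{end} is invalid for length {len(sequence)}")
    return sequence[start - 1:end]


def region_length(seq_id_no: int) -> int:
    """Number of residues in the recited range for a given SEQ ID NO."""
    start, end = EXTRACELLULAR_RANGES[seq_id_no]
    return end - start + 1


# Hypothetical 400-residue placeholder, not a real sequence.
placeholder = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(400))
start, end = EXTRACELLULAR_RANGES[5]
domain = extract_region(placeholder, start, end)
print(len(domain))
```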

Any of the antibodies of the invention can be conjugated to a therapeutic moiety or 
to a detectable substance. Non-limiting examples of detectable substances that can be 
conjugated to the antibodies of the invention are an enzyme, a prosthetic group, a 
fluorescent material, a luminescent material, a bioluminescent material, and a radioactive 
material. 

The invention also provides a kit containing an antibody of the invention conjugated 
to a detectable substance, and instructions for use. Still another aspect of the invention is a 
pharmaceutical composition comprising an antibody of the invention and a 
pharmaceutically acceptable carrier. In preferred embodiments, the pharmaceutical 
composition contains an antibody of the invention, a therapeutic moiety, and a 
pharmaceutically acceptable carrier. 

Still another aspect of the invention is a method of making an antibody that 
specifically recognizes INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, 
TANGO 295, TANGO 354, and TANGO 378, the method comprising immunizing a 
mammal with a polypeptide. The polypeptide used as an immunogen comprises an amino 
acid sequence selected from the group consisting of: the amino acid sequence of any one of 
SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid sequence encoded by 
the cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC® Accession 
Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at least 15 amino 
acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 
23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid 
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the 
percent identity is determined using the ALIGN program of the GCG software package with 
a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an 
amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the 
nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 
16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited as ATCC® 
Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC® Accession 
Number PTA-250, or a complement thereof, under conditions of hybridization of 6X SSC at 
45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. After immunization, a sample is 
collected from the mammal that contains an antibody that specifically recognizes the 
polypeptide. Preferably, the polypeptide is recombinantly produced using a non-human host cell. 
Optionally, the antibodies can be further purified from the sample using techniques well 
known to those of skill in the art. The method can further comprise producing a 
monoclonal antibody-producing cell from the cells of the mammal. Optionally, antibodies 
are collected from the antibody-producing cell. 
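
One of the immunogen options recited above is a fragment of at least 15 amino acid residues. As a minimal sketch, the code below enumerates every contiguous 15-residue window of a polypeptide, with 1-based coordinates, as candidate fragments. The function name and the example sequence are hypothetical, and nothing here ranks fragments by antigenicity.

```python
from typing import Iterator, Tuple

MIN_FRAGMENT_LENGTH = 15  # minimum fragment length recited above


def candidate_fragments(
    sequence: str, length: int = MIN_FRAGMENT_LENGTH
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, fragment) for each contiguous window of `length`.

    Coordinates are 1-based and inclusive. Sequences shorter than `length`
    yield nothing.
    """
    if length < MIN_FRAGMENT_LENGTH:
        raise ValueError("fragments must contain at least 15 residues")
    for offset in range(len(sequence) - length + 1):
        yield offset + 1, offset + length, sequence[offset:offset + length]


# Hypothetical 20-residue polypeptide, for illustration only.
polypeptide = "MKTLLVAGAVLLSACQAEPR"
windows = list(candidate_fragments(polypeptide))
for start, end, fragment in windows:
    print(start, end, fragment)
```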



III. Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors, 
containing a nucleic acid encoding a polypeptide of the invention (or a portion thereof). As 
used herein, the term "vector" refers to a nucleic acid molecule capable of transporting 
another nucleic acid to which it has been linked. One type of vector is a "plasmid", which 
refers to a circular double stranded DNA loop into which additional DNA segments can be 
ligated. Another type of vector is a viral vector, wherein additional DNA segments can be 
ligated into the viral genome. Certain vectors are capable of autonomous replication in a 
host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of 
replication and episomal mammalian vectors). Other vectors (e.g., non-episomal 
mammalian vectors) are integrated into the genome of a host cell upon introduction into the 
host cell, and thereby are replicated along with the host genome. Moreover, certain vectors, 
namely expression vectors, are capable of directing the expression of genes to which they are 
operably linked. In general, expression vectors of utility in recombinant DNA techniques 
are often in the form of plasmids (vectors). However, the invention is intended to include 
such other forms of expression vectors, such as viral vectors (e.g., replication defective 
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of the 
invention in a form suitable for expression of the nucleic acid in a host cell. This means 
that the recombinant expression vectors include one or more regulatory sequences, selected 
on the basis of the host cells to be used for expression, which are operably linked to the 
nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably 
linked" is intended to mean that the nucleotide sequence of interest is linked to the 
regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence 
(e.g., in an in vitro transcription/translation system or in a host cell when the vector is 
introduced into the host cell). The term "regulatory sequence" is intended to include 
promoters, enhancers and other expression control elements (e.g., polyadenylation signals). 
Such regulatory sequences are described, for example, in Goeddel, Gene Expression 
Technology: Methods in Enzymology, 1990, Academic Press, San Diego, CA. Regulatory 
sequences include those which direct constitutive expression of a nucleotide sequence in 
many types of host cell and those which direct expression of the nucleotide sequence only 
in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by 
those skilled in the art that the design of the expression vector can depend on such factors as 
the choice of the host cell to be transformed, the level of expression of protein desired, etc. 
The expression vectors of the invention can be introduced into host cells to thereby produce 
proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as 
described herein. 

The recombinant expression vectors of the invention can be designed for expression 
of a polypeptide of the invention in prokaryotic (e.g., E. coli) or eukaryotic cells (e.g., insect 
cells (using baculovirus expression vectors), yeast cells or mammalian cells). Suitable host 
cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression 
vector can be transcribed and translated in vitro, for example using T7 promoter regulatory 
sequences and T7 polymerase. 

Expression of proteins in prokaryotes is most often carried out in E. coli with 
vectors containing constitutive or inducible promoters directing the expression of either 
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein 
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion 
vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) 
to increase the solubility of the recombinant protein; and 3) to aid in the purification of the 
recombinant protein by acting as a ligand in affinity purification. Often, in fusion 
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion 
moiety and the recombinant protein to enable separation of the recombinant protein from 
the fusion moiety subsequent to purification of the fusion protein. Enzymes suitable for 
such cleavage, and their cognate recognition sequences, include Factor Xa, thrombin and 
enterokinase. Typical 
fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988, 
Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, 
Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or 
protein A, respectively, to the target recombinant protein. 

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc 
(Amann et al., 1988, Gene 69:301-15) and pET 11d (Studier et al., Gene Expression 
Technology: Methods in Enzymology, 1990, Academic Press, San Diego, CA pgs. 60-89). 
Target gene expression from the pTrc vector relies on host RNA polymerase transcription 
from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d vector 
relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral 
RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or 
HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional 
control of the lacUV5 promoter. 
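
The proteolytic cleavage step described above for fusion expression vectors can be sketched as a simple string operation. The recognition motifs and cut positions below (Ile-Glu-Gly-Arg for Factor Xa, Leu-Val-Pro-Arg-Gly-Ser for thrombin, Asp-Asp-Asp-Asp-Lys for enterokinase) are the forms commonly cited in the literature and are assumptions here, since this section names the enzymes but not their sites. The tag and target sequences are hypothetical.

```python
# Commonly cited recognition motifs in one-letter code (assumed, see lead-in).
# The integer is the number of motif residues that stay with the N-terminal
# product, i.e. the cut falls immediately after that many residues.
CLEAVAGE_SITES = {
    "Factor Xa": ("IEGR", 4),       # cut after Arg
    "thrombin": ("LVPRGS", 4),      # cut between Arg and Gly
    "enterokinase": ("DDDDK", 5),   # cut after Lys
}


def cleave(fusion_protein: str, protease: str) -> tuple:
    """Split a fusion protein at the first occurrence of the protease site."""
    motif, offset = CLEAVAGE_SITES[protease]
    position = fusion_protein.find(motif)
    if position == -1:
        raise ValueError(f"no {protease} site found")
    cut = position + offset
    return fusion_protein[:cut], fusion_protein[cut:]


# Hypothetical stand-ins for the fusion moiety and the recombinant protein.
tag = "MSPILGYWKIKGL"
target = "MKTLLVAG"

xa_tag, xa_target = cleave(tag + "IEGR" + target, "Factor Xa")
th_tag, th_target = cleave(tag + "LVPRGS" + target, "thrombin")
print(xa_target)
print(th_target)
```

In this sketch a Factor Xa or enterokinase site placed directly before the target leaves no extra residues on the released protein, whereas the thrombin site leaves two residues (Gly-Ser) at its amino terminus.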

One strategy to maximize recombinant protein expression in E. coli is to express the 
protein in a host bacteria with an impaired capacity to proteolytically cleave the 
recombinant protein (Gottesman, Gene Expression Technology: Methods in Enzymology, 
1990, Academic Press, San Diego, CA pgs. 119-128). Another strategy is to alter the 
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the 
individual codons for each amino acid are those preferentially utilized in E. coli (Wada et 
al., 1992, Nucleic Acids Res. 20:2111-8). Such alteration of nucleic acid sequences of the 
invention can be carried out by standard DNA synthesis techniques. 
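
The codon replacement strategy described above can be sketched as follows: each codon of an open reading frame is translated and then replaced with a synonymous codon from a preference table, so that the encoded protein is unchanged. The preference table is an illustrative assumption and is not taken from Wada et al. or from this document; the function names and the example sequence are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative preferred codons for E. coli (an assumption for this sketch).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def codons(dna: str) -> list:
    """Split an in-frame DNA sequence into codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(dna.upper()))


def recode(dna: str) -> str:
    """Replace each sense codon with the preferred synonymous codon."""
    recoded = []
    for codon in codons(dna.upper()):
        amino_acid = CODON_TABLE[codon]
        # Stop codons are left unchanged.
        recoded.append(codon if amino_acid == "*" else PREFERRED[amino_acid])
    return "".join(recoded)


original = "ATGCTAAGAGGAATACCCTAA"   # hypothetical open reading frame
optimized = recode(original)
print(optimized)
print(translate(original) == translate(optimized))
```

Checking that the translation is unchanged after recoding is the essential safeguard, since the point of the strategy is to alter the nucleic acid sequence without altering the polypeptide.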

In another embodiment, the expression vector is a yeast expression vector. 
Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al., 
1987, EMBO J. 6:229-34), pMFa (Kurjan and Herskowitz, 1982, Cell 30:933-43), pJRY88 
(Schultz et al., 1987, Gene 54:113-23), pYES2 (Invitrogen Corporation, San Diego, CA), 
and pPicZ (Invitrogen Corp, San Diego, CA). 

Alternatively, the expression vector is a baculovirus expression vector. Baculovirus 
vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include 
the pAc series (Smith et al., 1983, Mol. Cell Biol. 3:2156-65) and the pVL series (Luckow 
and Summers, 1989, Virology 170:31-9). 

In yet another embodiment, a nucleic acid of the invention is expressed in 
mammalian cells using a mammalian expression vector. Examples of mammalian 
expression vectors include pCDM8 (Seed, 1987, Nature 329:840) and pMT2PC (Kaufman 
et al., 1987, EMBO J. 6:187-95). When used in mammalian cells, the expression vector's 
control functions are often provided by viral regulatory elements. For example, commonly 
used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian 
Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells 
see chapters 16 and 17 of Sambrook et al., supra. 

In another embodiment, the recombinant mammalian expression vector is capable of 
directing expression of the nucleic acid preferentially in a particular cell type (e.g., 
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific 
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific 
promoters include the albumin promoter (liver-specific; Pinkert et al., 1987, Genes Dev. 
1:268-77), lymphoid-specific promoters (Calame and Eaton, 1988, Adv. Immunol. 
43:235-75), in particular promoters of T cell receptors (Winoto and Baltimore, 1989, EMBO J. 
8:729-33) and immunoglobulins (Banerji et al., 1983, Cell 33:729-40; Queen and 
Baltimore, 1983, Cell 33:741-8), neuron-specific promoters (e.g., the neurofilament 
promoter; Byrne and Ruddle, 1989, Proc. Natl. Acad. Sci. USA 86:5473-7), 
pancreas-specific promoters (Edlund et al., 1985, Science 230:912-6), and mammary gland-specific 
promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application 
Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for 
example the murine hox promoters (Kessel and Gruss, 1990, Science 249:374-9) and the 
α-fetoprotein promoter (Camper and Tilghman, Genes Dev. 3:537-46). 

The invention further provides a recombinant expression vector comprising a DNA 
molecule of the invention cloned into the expression vector in an antisense orientation. 
That is, the DNA molecule is operably linked to a regulatory sequence in a manner which 
allows for expression (by transcription of the DNA molecule) of an RNA molecule which is 
antisense to the mRNA encoding a polypeptide of the invention. Regulatory sequences 
operably linked to a nucleic acid cloned in the antisense orientation can be chosen which 
direct the continuous expression of the antisense RNA molecule in a variety of cell types, 
for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which 
direct constitutive, tissue specific or cell type specific expression of antisense RNA. The 
antisense expression vector can be in the form of a recombinant plasmid, phagemid or 
attenuated virus in which antisense nucleic acids are produced under the control of a high 
efficiency regulatory region, the activity of which can be determined by the cell type into 
which the vector is introduced. For a discussion of the regulation of gene expression using 
antisense genes see Weintraub et al. (1985, Reviews - Trends in Genetics 1(1):22-5). 
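
The antisense orientation described above can be illustrated with a reverse complement. In this minimal sketch, a coding (sense) DNA segment is placed in the vector as its reverse complement, so that transcription yields an RNA complementary to the mRNA. The function names and the short example sequence are hypothetical, and promoter and vector details are omitted.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def transcript_of(coding_strand: str) -> str:
    """RNA with the same sequence as the given coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def is_antisense(rna_a: str, rna_b: str) -> bool:
    """True if two RNAs can pair along their full length in antiparallel fashion."""
    return rna_a == rna_b.translate(RNA_COMPLEMENT)[::-1]


sense_insert = "ATGGCCAAG"                          # hypothetical coding segment
antisense_insert = reverse_complement(sense_insert)  # orientation used in the vector

mrna = transcript_of(sense_insert)
antisense_rna = transcript_of(antisense_insert)
print(antisense_insert)
print(is_antisense(antisense_rna, mrna))
```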

Another aspect of the invention pertains to host cells into which a recombinant 
expression vector of the invention has been introduced. The terms "host cell" and 
"recombinant host cell" are used interchangeably herein. It is understood that such terms 
refer not only to the particular subject cell but to the progeny or potential progeny of such a 
cell. Because certain modifications may occur in succeeding generations due to either 
mutation or environmental influences, such progeny may not, in fact, be identical to the 
parent cell, but are still included within the scope of the term as used herein. 

A host cell can be any prokaryotic (e.g., E. coli) or eukaryotic cell (e.g., insect cells, 
yeast or mammalian cells). 

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional 
transformation or transfection techniques. As used herein, the terms "transformation" and 
"transfection" are intended to refer to a variety of art-recognized techniques for introducing 
foreign nucleic acid into a host cell, including calcium phosphate or calcium chloride 
co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. 
Suitable methods for transforming or transfecting host cells can be found in Sambrook, et 
al. (supra), and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the 
expression vector and transfection technique used, only a small fraction of cells may 
integrate the foreign DNA into their genome. In order to identify and select these 
integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is 
generally introduced into the host cells along with the gene of interest. Preferred selectable 
markers include those which confer resistance to drugs, such as G418, hygromycin and 
methotrexate. Cells stably transfected with the introduced nucleic acid can be identified by 
drug selection (e.g., cells that have incorporated the selectable marker gene will survive, 
while the other cells die). 

In another embodiment, the expression characteristics of an endogenous (e.g., 
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, 
and TANGO 378) nucleic acid within a cell, cell line or microorganism may be modified by 
inserting a DNA regulatory element heterologous to the endogenous gene of interest into 
the genome of a cell, stable cell line or cloned microorganism such that the inserted 
regulatory element is operatively linked with the endogenous gene (e.g., INTERCEPT 340, 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378) 
and controls, modulates or activates the endogenous gene. For example, endogenous 
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, 
and TANGO 378 which are normally "transcriptionally silent", i.e., INTERCEPT 340, 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 
genes which are normally not expressed, or are expressed only at very low levels in a cell 
line or microorganism, may be activated by inserting a regulatory element which is capable 
of promoting the expression of a normally expressed gene product in that cell line or 
microorganism. Alternatively, transcriptionally silent, endogenous INTERCEPT 340, 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 
genes may be activated by insertion of a promiscuous regulatory element that works across 
cell types. 

A heterologous regulatory element may be inserted into a stable cell line or cloned 
microorganism, such that it is operatively linked with and activates expression of 
endogenous INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, and TANGO 378 genes, using techniques, such as targeted homologous 
recombination, which are well known to those of skill in the art, and described, e.g., in 
Chappel, U.S. Patent No. 5,272,071; PCT Publication No. WO 91/06667, published May 
16, 1991. 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, 
can be used to produce a polypeptide of the invention. Accordingly, the invention further 
provides methods for producing a polypeptide of the invention using the host cells of the 
invention. In one embodiment, the method comprises culturing the host cell of the invention 
(into which a recombinant expression vector encoding a polypeptide of the invention has 
been introduced) in a suitable medium such that the polypeptide is produced. In another 
embodiment, the method further comprises isolating the polypeptide from the medium or 
the host cell. 

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte 
or an embryonic stem cell into which sequences encoding a polypeptide of the invention 
have been introduced. Such host cells can then be used to create non-human transgenic 
animals in which exogenous sequences encoding a polypeptide of the invention have been 
introduced into their genome or homologous recombinant animals in which endogenous 
sequences encoding a polypeptide of the invention have been altered. Such animals are 
useful for studying the function and/or activity of the polypeptide and for identifying and/or 
evaluating modulators of polypeptide activity. As used herein, a "transgenic animal" is a 
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, 
in which one or more of the cells of the animal includes a transgene. Other examples of 
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, 
amphibians, etc. A transgene is exogenous DNA which is integrated into the genome of a 
cell from which a transgenic animal develops and which remains in the genome of the 
mature animal, thereby directing the expression of an encoded gene product in one or more 
cell types or tissues of the transgenic animal. As used herein, an "homologous recombinant 
animal" is a non-human animal, preferably a mammal, more preferably a mouse, in which 
an endogenous gene has been altered by homologous recombination between the 
endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, 
e.g., an embryonic cell of the animal, prior to development of the animal. 

A transgenic animal of the invention can be created by introducing nucleic acid 
encoding a polypeptide of the invention (or a homologue thereof) into the male pronuclei of 
a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to 
develop in a pseudopregnant female foster animal. Intronic sequences and polyadenylation 
signals can also be included in the transgene to increase the efficiency of expression of the 
transgene. A tissue-specific regulatory sequence(s) can be operably linked to the transgene 
to direct expression of the polypeptide of the invention to particular cells. Methods for 
generating transgenic animals via embryo manipulation and microinjection, particularly 
animals such as mice, have become conventional in the art and are described, for example, 
in U.S. Patent NOs. 4,736,866; 4,870,009; 4,873,191 and in Hogan (Manipulating the 
Mouse Embryo, 1986, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). 
Similar methods are used for production of other transgenic animals. A transgenic founder 
animal can be identified based upon the presence of the transgene in its genome and/or 
expression of mRNA encoding the transgene in tissues or cells of the animals. A transgenic 
founder animal can then be used to breed additional animals carrying the transgene. 
Moreover, transgenic animals carrying the transgene can further be bred to other transgenic 
animals carrying other transgenes. 

To create an homologous recombinant animal, a vector is prepared which contains at 
least a portion of a gene encoding a polypeptide of the invention into which a deletion, 
addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the 
gene. In a preferred embodiment, the vector is designed such that, upon homologous 
recombination, the endogenous gene is functionally disrupted (i.e., no longer encodes a 
functional protein; also referred to as a "knock out" vector). Alternatively, the vector can be 
designed such that, upon homologous recombination, the endogenous gene is mutated or 
otherwise altered but still encodes functional protein (e.g., the upstream regulatory region 
can be altered to thereby alter the expression of the endogenous protein). In the 
homologous recombination vector, the altered portion of the gene is flanked at its 5' and 3' 
ends by additional nucleic acid of the gene to allow for homologous recombination to occur 
between the exogenous gene carried by the vector and an endogenous gene in an embryonic 
stem cell. The additional flanking nucleic acid sequences are of sufficient length for 
successful homologous recombination with the endogenous gene. Typically, several 
kilobases of flanking DNA (both at the 5' and 3' ends) are included in the vector (see, e.g., 
Thomas and Capecchi, 1987, Cell 51:503 for a description of homologous recombination 
vectors). The vector is introduced into an embryonic stem cell line (e.g., by electroporation) 
and cells in which the introduced gene has homologously recombined with the endogenous 
gene are selected (see, e.g., Li et al., 1992, Cell 69:915). The selected cells are then injected 
into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see, e.g., 
Bradley in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, 1987, 
Robertson, ed., IRL, Oxford pgs. 113-52). A chimeric embryo can then be implanted into a 
suitable pseudopregnant female foster animal and the embryo brought to term. Progeny 
harboring the homologously recombined DNA in their germ cells can be used to breed 
animals in which all cells of the animal contain the homologously recombined DNA by 
germline transmission of the transgene. Methods for constructing homologous 
recombination vectors and homologous recombinant animals are described further in 
Bradley, 1991, Current Opinion in Bio/Technology 2:823-9 and in PCT Publication NOs. 
WO 90/11354, WO 91/01140, WO 92/0968 and WO 93/04169. 
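
The layout of the homologous recombination vector described above (an altered portion flanked by 5' and 3' homology arms) can be sketched as a string replacement. This toy model simply locates the two arms in a genomic sequence and swaps the region between them for the altered portion. The function name and all sequences are hypothetical, and the arms are only a few bases long for readability, whereas the text above calls for several kilobases of flanking DNA.

```python
def homologous_recombination(genome: str, arm_5: str, altered: str, arm_3: str) -> str:
    """Replace the genomic region between two homology arms with `altered`.

    Raises ValueError if either arm is missing, modeling a vector whose
    flanking sequence does not match the endogenous gene.
    """
    start = genome.find(arm_5)
    if start == -1:
        raise ValueError("5' homology arm not found in genome")
    arm_5_end = start + len(arm_5)
    arm_3_start = genome.find(arm_3, arm_5_end)
    if arm_3_start == -1:
        raise ValueError("3' homology arm not found downstream of 5' arm")
    return genome[:arm_5_end] + altered + genome[arm_3_start:]


# Hypothetical toy sequences.
arm_5 = "AAACCC"
arm_3 = "TTTGGG"
endogenous_exon = "ATGAAATAG"
genome = "GGGG" + arm_5 + endogenous_exon + arm_3 + "CCCC"
disruption_cassette = "GCGCGC"   # stands in for, e.g., a selectable marker

targeted = homologous_recombination(genome, arm_5, disruption_cassette, arm_3)
print(targeted)
```

After the exchange the endogenous segment is absent from the targeted sequence, which corresponds to the functionally disrupted ("knock out") case described above.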

In another embodiment, transgenic non-human animals can be produced which 
contain selected systems which allow for regulated expression of the transgene. One 
example of such a system is the cre/loxP recombinase system of bacteriophage PI. For a 
description of the cre/loxP recombinase system, see, e.g., Lakso et ah, 1992, Proc. Natl 
Acad. Sou USA 89:6232-6. Another example of a recombinase system is the FLP 
recombinase system of Saccharomyces cerevisiae (O'Gorman et al., 1991, Science^ 
251 : 135 1-5). If a cre/loxP recombinase system is used to regulate expression of the 
transgene, animals containing transgenes encoding both the Cre recombinase and a selected 
protein are required. Such animals can be provided through the construction of "double" 
transgenic animals, e.g., by mating two transgenic animals, one containing a transgene
encoding a selected protein and the other containing a transgene encoding a recombinase. 

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut et al., 1997, Nature 385:810-3 and PCT 
Publication NOs. WO 97/07668 and WO 97/07669. 



IV. Pharmaceutical Compositions 

The nucleic acid molecules, polypeptides, and antibodies (also referred to herein as 
"active compounds") of the invention can be incorporated into pharmaceutical compositions 
suitable for administration. Such compositions typically comprise the nucleic acid 
molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein 
the language "pharmaceutically acceptable carrier" is intended to include any and all 
solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and 
absorption delaying agents, and the like, compatible with pharmaceutical administration. 
The use of such media and agents for pharmaceutically active substances is well known in 
the art. Except insofar as any conventional media or agent is incompatible with the active 
compound, use thereof in the compositions is contemplated. Supplementary active
compounds can also be incorporated into the compositions. 

The invention includes methods for preparing pharmaceutical compositions for 
modulating the expression or activity of a polypeptide or nucleic acid of the invention. 
Such methods comprise formulating a pharmaceutically acceptable carrier with an agent 
which modulates expression or activity of a polypeptide or nucleic acid of the invention. 
Such compositions can further include additional active agents. Thus, the invention further 
includes methods for preparing a pharmaceutical composition by formulating a 
pharmaceutically acceptable carrier with an agent which modulates expression or activity of 
a polypeptide or nucleic acid of the invention and one or more additional active compounds.

The agent which modulates expression or activity may, for example, be a small
molecule. For example, such small molecules include peptides, peptidomimetics, amino 
acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide 
analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic 
compounds) having a molecular weight less than about 10,000 grams per mole, organic or 
inorganic compounds having a molecular weight less than about 5,000 grams per mole, 
organic or inorganic compounds having a molecular weight less than about 1,000 grams per 
mole, organic or inorganic compounds having a molecular weight less than about 500 
grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such
compounds. It is understood that appropriate doses of small molecule agents depend upon
a number of factors within the ken of the ordinarily skilled physician, veterinarian, or 
researcher. The dose(s) of the small molecule will vary, for example, depending upon the 
identity, size, and condition of the subject or sample being treated, further depending upon 
the route by which the composition is to be administered, if applicable, and the effect which 
the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of 
the invention. Exemplary doses include milligram or microgram amounts of the small 
molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to
about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5
milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per
kilogram). It is furthermore understood that appropriate doses of a small molecule depend
upon the potency of the small molecule with respect to the expression or activity to be 
modulated. Such appropriate doses may be determined using the assays described herein. 
When one or more of these small molecules is to be administered to an animal (e.g., a 
human) in order to modulate expression or activity of a polypeptide or nucleic acid of the 
invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively 
low dose at first, subsequently increasing the dose until an appropriate response is obtained. 
In addition, it is understood that the specific dose level for any particular animal subject will 
depend upon a variety of factors including the activity of the specific compound employed, 
the age, body weight, general health, gender, and diet of the subject, the time of 
administration, the route of administration, the rate of excretion, any drug combination, and 
the degree of expression or activity to be modulated.
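
By way of illustration only, the exemplary per-kilogram dose ranges recited above can be converted into absolute amounts for a subject or sample of known weight. The following Python sketch is not part of the disclosed methods. The names DoseRange, absolute_mg, and EXEMPLARY_RANGES, together with the labels "broad", "intermediate", and "narrow", are hypothetical and are introduced solely for this example. The word "about" in the recited ranges is ignored for simplicity.

```python
from dataclasses import dataclass

UG_PER_MG = 1000.0  # micrograms per milligram


@dataclass(frozen=True)
class DoseRange:
    """A dose range expressed in micrograms per kilogram (hypothetical helper)."""
    low_ug_per_kg: float
    high_ug_per_kg: float

    def absolute_mg(self, weight_kg: float) -> tuple[float, float]:
        """Return the (low, high) absolute dose in milligrams for a given weight."""
        if weight_kg <= 0:
            raise ValueError("weight_kg must be positive")
        low = self.low_ug_per_kg * weight_kg / UG_PER_MG
        high = self.high_ug_per_kg * weight_kg / UG_PER_MG
        return (low, high)


# The three exemplary ranges recited in the text; the labels are assumptions.
EXEMPLARY_RANGES = {
    "broad": DoseRange(1.0, 500_000.0),        # 1 microgram/kg to 500 mg/kg
    "intermediate": DoseRange(100.0, 5_000.0),  # 100 micrograms/kg to 5 mg/kg
    "narrow": DoseRange(1.0, 50.0),             # 1 microgram/kg to 50 micrograms/kg
}

if __name__ == "__main__":
    for label, dose_range in EXEMPLARY_RANGES.items():
        print(label, dose_range.absolute_mg(70.0))
```

Under these assumptions, the range of 100 micrograms per kilogram to 5 milligrams per kilogram corresponds to 7 mg to 350 mg for a 70 kg subject. The appropriate dose within any such range remains a matter for the practitioner, as described above.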

A pharmaceutical composition of the invention is formulated to be compatible with 
its intended route of administration. Examples of routes of administration include 
parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal
(topical), transmucosal, and rectal administration. Solutions or suspensions used for
parenteral, intradermal, or subcutaneous application can include the following components:
a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols,
glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl
alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating 
agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or 
phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. 
pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. 
The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose 
vials made of glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile aqueous 
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersions. For intravenous administration, 
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF;
Parsippany, NJ) or phosphate buffered saline (PBS). In all cases, the composition must be 
sterile and should be fluid to the extent that easy syringability exists. It must be stable 
under the conditions of manufacture and storage and must be preserved against the 
contaminating action of microorganisms such as bacteria and fungi. The carrier can be a 
solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, 
glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable
mixtures thereof. The proper fluidity can be maintained, for example, by the use of a
coating such as lecithin, by the maintenance of the required particle size in the case of 
dispersion and by the use of surfactants. Prevention of the action of microorganisms can be 
achieved by various antibacterial and antifungal agents, for example, parabens, 
chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be 
preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol
or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable
compositions can be brought about by including in the composition an agent which delays 
absorption, for example, aluminum monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active compound 
(e.g., a polypeptide or antibody) in the required amount in an appropriate solvent with one 
or a combination of ingredients enumerated above, as required, followed by filtered 
sterilization. Generally, dispersions are prepared by incorporating the active compound into 
a sterile vehicle which contains a basic dispersion medium and the required other 
ingredients from those enumerated above. In the case of sterile powders for the preparation 
of sterile injectable solutions, the preferred methods of preparation are vacuum drying and
freeze-drying, which yield a powder of the active ingredient plus any additional desired
ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can 
be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral
therapeutic administration, the active compound can be incorporated with excipients and
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared 
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is 
applied orally and swished and expectorated or swallowed.

Pharmaceutically compatible binding agents, and/or adjuvant materials can be 
included as part of the composition. The tablets, pills, capsules, troches and the like can
contain any of the following ingredients, or compounds of a similar nature: a binder such as
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose;
a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as
magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening 
agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl 
salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, 
e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art, and
include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid
derivatives. Transmucosal administration can be accomplished through the use of nasal
sprays or suppositories. For transdermal administration, the active compounds are 
formulated into ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with
conventional suppository bases such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will 
protect the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation 
of such formulations will be apparent to those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to 
viral antigens) can also be used as pharmaceutically acceptable carriers. These can be 
prepared according to methods known to those skilled in the art, for example, as described 
in U.S. Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage
unit form for ease of administration and uniformity of dosage. Dosage unit form as used 
herein refers to physically discrete units suited as unitary dosages for the subject to be 
treated; each unit containing a predetermined quantity of active compound calculated to 
produce the desired therapeutic effect in association with the required pharmaceutical
carrier. The specifications for the dosage unit forms of the invention are dictated by and
directly dependent on the unique characteristics of the active compound and the particular
therapeutic effect to be achieved, and the limitations inherent in the art of compounding
such an active compound for the treatment of individuals. 

For antibodies, the preferred dosage is 0.1 mg/kg to 100 mg/kg of body weight 
(generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 
mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully
human antibodies have a longer half-life within the human body than other antibodies.
Accordingly, lower dosages and less frequent administration are often possible.
Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake
and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is
described by Cruikshank et al. (1997, Acquired Immune Deficiency Syndromes and 
Human Retrovirology 14:193). 

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., 
an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 
0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and
even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 
6 mg/kg body weight. 

The skilled artisan will appreciate that certain factors may influence the dosage 
required to effectively treat a subject, including but not limited to the severity of the disease 
or disorder, previous treatments, the general health and/or age of the subject, and other 
diseases present. Moreover, treatment of a subject with a therapeutically effective amount 
of a protein, polypeptide, or antibody can include a single treatment or, preferably, can 
include a series of treatments. In a preferred example, a subject is treated with antibody,
protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one 
time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more 
preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. 
It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used 
for treatment may increase or decrease over the course of a particular treatment. Changes in 
dosage may result and become apparent from the results of diagnostic assays as described 
herein. 
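
The once-weekly regimen described above can be laid out as a simple schedule. The following Python sketch is offered only as an illustration under stated assumptions and is not a dosing recommendation. The function name weekly_schedule and its parameters are hypothetical, a constant dose is assumed for every administration even though the text notes that the effective dosage may change during treatment, and the recited "about" limits (0.1 to 20 mg/kg, 1 to 10 weeks) are treated as hard bounds for simplicity.

```python
from datetime import date, timedelta


def weekly_schedule(start: date, weight_kg: float,
                    dose_mg_per_kg: float, weeks: int) -> list[tuple[date, float]]:
    """Return (administration date, absolute dose in mg) for a once-weekly regimen.

    Hypothetical helper: bounds mirror the preferred example in the text
    (0.1 to 20 mg/kg, one time per week, for 1 to 10 weeks).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if not 0.1 <= dose_mg_per_kg <= 20:
        raise ValueError("dose_mg_per_kg outside the 0.1 to 20 mg/kg range")
    if not 1 <= weeks <= 10:
        raise ValueError("weeks outside the 1 to 10 week range")
    dose_mg = dose_mg_per_kg * weight_kg
    # One administration per week, starting on the start date.
    return [(start + timedelta(weeks=i), dose_mg) for i in range(weeks)]


if __name__ == "__main__":
    for day, dose in weekly_schedule(date(2000, 1, 3), 70, 5, 4):
        print(day.isoformat(), dose, "mg")
```

A practitioner adjusting the dose during treatment, for example in response to diagnostic assay results, would replace the constant dose with a per-week value.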

The nucleic acid molecules of the invention can be inserted into vectors and used as
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, 
intravenous injection, local administration (U.S. Patent 5,328,470) or by stereotactic 
injection (see, e.g., Chen et al., 1994, Proc. Natl. Acad. Sci. USA 91:3054-7). The
pharmaceutical preparation of the gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery
vehicle is imbedded. Alternatively, where the complete gene delivery vector can be
produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical
preparation can include one or more cells which produce the gene delivery system. 

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

V. Uses and Methods of the Invention 

The nucleic acid molecules, proteins, protein homologues, and antibodies described 
herein can be used in one or more of the following methods: a) screening assays; b) 
detection assays (e.g., chromosomal mapping, tissue typing, forensic biology); c) predictive 
medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and 
pharmacogenomics); and d) methods of treatment (e.g., therapeutic and prophylactic). For 
example, the INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, and TANGO 378 polypeptides of the invention can be used to modulate
cellular function, survival, morphology, proliferation, and/or differentiation of the cells in
which they are expressed. For example, the polypeptides of the invention can be used to
treat diseases such as neoplastic disorders (e.g., cancer, tumors), hematopoietic disorders 
(e.g., T cell disorders), among others. The isolated nucleic acid molecules of the invention 
can be used to express proteins (e.g., via a recombinant expression vector in a host cell in 
gene therapy applications), to detect mRNA (e.g., in a biological sample) or a genetic
lesion, and to modulate activity of a polypeptide of the invention. In addition, the 
polypeptides of the invention can be used to screen drugs or compounds which modulate 
activity or expression of a polypeptide of the invention as well as to treat disorders 
characterized by insufficient or excessive production of a protein of the invention or 
production of a form of a protein of the invention which has decreased or aberrant activity 
compared to the wild type protein. In addition, the antibodies of the invention can be used 
to detect and isolate a protein of the invention and modulate activity of a protein of the 
invention. 

This invention further pertains to novel agents identified by the above-described 
screening assays and uses thereof for treatments as described herein.

A. Screening Assays

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides,
peptidomimetics, small molecules or other drugs) which bind to a polypeptide of the
invention or have a stimulatory or inhibitory effect on, for example, expression or activity 
of a polypeptide of the invention. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of a 
polypeptide of the invention or biologically active portion thereof. The test compounds of 
the present invention can be obtained using any of the numerous approaches in 
combinatorial library methods known in the art, including: biological libraries; spatially 
addressable parallel solid phase or solution phase libraries; synthetic library methods 
requiring deconvolution; the "one-bead one-compound" library method; and synthetic 
library methods using affinity chromatography selection. The biological library approach is 
limited to peptide libraries, while the other four approaches are applicable to peptide, non- 
peptide oligomer or small molecule libraries of compounds (Lam, 1997, Anticancer Drug 
Des. 12:145). 

Examples of methods for the synthesis of molecular libraries can be found in the art, 
for example in: DeWitt et al., 1993, Proc. Natl. Acad. Sci. USA 90:6909; Erb et al., 1994,
Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al., 1994, J. Med. Chem. 37:2678;
Cho et al., 1993, Science 261:1303; Carrell et al., 1994, Angew. Chem. Int. Ed. Engl.
33:2059; Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al., 1994,
J. Med. Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992, 
Bio/Techniques 13:412-21), or on beads (Lam, 1991, Nature 354:82-4), chips (Fodor, 1993, 
Nature 364:555-6), bacteria (U.S. Patent No. 5,223,409), spores (U.S. Patent NOs. 
5,571,698; 5,403,484; and 5,223,409), plasmids (Cull et al., 1992, Proc. Natl. Acad. Sci.
USA 89:1865-9) or phage (Scott and Smith, 1990, Science 249:386-90; Devlin, 1990,
Science 249:404-6; Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87:6378-82; and Felici,
1991, J. Mol. Biol. 222:301-10).

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of a polypeptide of the invention, or a biologically active portion 
thereof, on the cell surface is contacted with a test compound and the ability of the test 
compound to bind to the polypeptide determined. The cell, for example, can be a yeast cell 
or a cell of mammalian origin. Determining the ability of the test compound to bind to the 
polypeptide can be accomplished, for example, by coupling the test compound with a 
radioisotope or enzymatic label such that binding of the test compound to the polypeptide or
biologically active portion thereof can be determined by detecting the labeled compound in
a complex. For example, test compounds can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either
directly or indirectly, and the radioisotope detected by direct counting of radioemission or
by scintillation counting. Alternatively, test compounds can be enzymatically labeled with,
for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic
label detected by determination of conversion of an appropriate substrate to product. In a
preferred embodiment, the assay comprises contacting a cell which expresses a membrane- 
bound form of a polypeptide of the invention, or a biologically active portion thereof, on the 
cell surface with a known compound which binds the polypeptide to form an assay mixture,
contacting the assay mixture with a test compound, and determining the ability of the test 
compound to interact with the polypeptide, wherein determining the ability of the test 
compound to interact with the polypeptide comprises determining the ability of the test 
compound to preferentially bind to the polypeptide or a biologically active portion thereof 
as compared to the known compound. 
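
One possible way to quantify such preferential binding, offered here as an assumption rather than as the assay of the invention, is to label the known compound and measure how much of its specific signal is displaced when the test compound is added. The following Python sketch illustrates that arithmetic. The function name percent_displacement, its parameters, and the subtraction of a nonspecific background signal are hypothetical choices made for this example, and the numerical values are invented for illustration.

```python
def percent_displacement(total_signal: float,
                         signal_with_test: float,
                         nonspecific_signal: float) -> float:
    """Percent of specifically bound labeled known compound displaced by a test compound.

    total_signal:       label detected with the known compound alone
    signal_with_test:   label detected after the test compound is added
    nonspecific_signal: background label not attributable to the polypeptide
    (All three quantities share the same units, e.g. counts per minute.)
    """
    specific_total = total_signal - nonspecific_signal
    if specific_total <= 0:
        raise ValueError("total_signal must exceed nonspecific_signal")
    specific_with_test = signal_with_test - nonspecific_signal
    return 100.0 * (1.0 - specific_with_test / specific_total)


if __name__ == "__main__":
    # Invented example values, in counts per minute.
    print(percent_displacement(10000.0, 3250.0, 1000.0))
```

With the invented values shown, 9000 counts of specific signal fall to 2250 counts, a displacement of 75 percent. A higher displacement at a given concentration would suggest that the test compound competes more effectively with the known compound for the polypeptide.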

In another embodiment, the assay involves assessment of an activity characteristic of 
the polypeptide, wherein binding of the test compound with the polypeptide or a 
biologically active portion thereof alters (e.g., increases or decreases) the activity of the
polypeptide. 

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of a polypeptide of the invention, or a biologically 
active portion thereof, on the cell surface with a test compound and determining the ability 
of the test compound to modulate (e.g., stimulate or inhibit) the activity of the polypeptide 
or biologically active portion thereof. Determining the ability of the test compound to 
modulate the activity of the polypeptide or a biologically active portion thereof can be
accomplished, for example, by determining the ability of the polypeptide to bind to
or interact with a target molecule or to transport molecules across the cytoplasmic
membrane. 

Determining the ability of a polypeptide of the invention to bind to or interact with a 
target molecule can be accomplished by one of the methods described above for 
determining direct binding. As used herein, a "target molecule" is a molecule with which a 
selected polypeptide (e.g., a polypeptide of the invention) binds or interacts in nature,
for example, a molecule on the surface of a cell which expresses the selected protein, a 
molecule on the surface of a second cell, a molecule in the extracellular milieu, a molecule 
associated with the internal surface of a cell membrane or a cytoplasmic molecule. A target 
molecule can be a polypeptide of the invention or some other polypeptide or protein. For 
example, a target molecule can be a component of a signal transduction pathway which 
facilitates transduction of an extracellular signal (e.g., a signal generated by binding of a
compound to a polypeptide of the invention) through the cell membrane and into the cell or
a second intercellular protein which has catalytic activity or a protein which facilitates the
association of downstream signaling molecules with a polypeptide of the invention.
Determining the ability of a polypeptide of the invention to bind to or interact with a target 
molecule can be accomplished by determining the activity of the target molecule. For 
example, the activity of the target molecule can be determined by detecting induction of a
cellular second messenger of the target (e.g., intracellular Ca²⁺, diacylglycerol, IP3, etc.),
detecting catalytic/enzymatic activity of the target on an appropriate substrate, detecting the
induction of a reporter gene (e.g., a regulatory element that is responsive to a polypeptide of
the invention operably linked to a nucleic acid encoding a detectable marker, e.g.,
luciferase), or detecting a cellular response, for example, cellular differentiation, or cell
proliferation. 

In yet another embodiment, an assay of the present invention is a cell-free assay 
comprising contacting a polypeptide of the invention or biologically active portion thereof 
with a test compound and determining the ability of the test compound to bind to the 
polypeptide or biologically active portion thereof. Binding of the test compound to the
polypeptide can be determined either directly or indirectly as described above. In a 
preferred embodiment, the assay includes contacting the polypeptide of the invention or 
biologically active portion thereof with a known compound which binds the polypeptide to 
form an assay mixture, contacting the assay mixture with a test compound, and determining 
the ability of the test compound to interact with the polypeptide, wherein determining the
ability of the test compound to interact with the polypeptide comprises determining the 
ability of the test compound to preferentially bind to the polypeptide or biologically active 
portion thereof as compared to the known compound. 

In another embodiment, an assay is a cell-free assay comprising contacting a
polypeptide of the invention or biologically active portion thereof with a test compound and
determining the ability of the test compound to modulate (e.g., stimulate or inhibit) the
activity of the polypeptide or biologically active portion thereof. Determining the ability of
the test compound to modulate the activity of the polypeptide can be accomplished, for 
example, by determining the ability of the polypeptide to bind to a target molecule by one 
of the methods described above for determining direct binding. In an alternative 
embodiment, determining the ability of the test compound to modulate the activity of the 
polypeptide can be accomplished by determining the ability of the polypeptide of the 
invention to further modulate the target molecule. For example, the catalytic/enzymatic 
activity of the target molecule on an appropriate substrate can be determined as previously 
described.

In yet another embodiment, the cell-free assay comprises contacting a polypeptide of
the invention or biologically active portion thereof with a known compound which binds the 
polypeptide to form an assay mixture, contacting the assay mixture with a test compound, 
and determining the ability of the test compound to interact with the polypeptide, wherein 
determining the ability of the test compound to interact with the polypeptide comprises 
determining the ability of the polypeptide to preferentially bind to or modulate the activity 
of a target molecule. 

The cell-free assays of the present invention are amenable to use of either a soluble
form or the membrane-bound form of a polypeptide of the invention. In the case of cell-free 
assays comprising the membrane-bound form of the polypeptide, it may be desirable to 
utilize a solubilizing agent such that the membrane-bound form of the polypeptide is 
maintained in solution. Examples of such solubilizing agents include non-ionic detergents 
such as n-octylglucoside, n-dodecylglucoside, n-octylmaltoside, octanoyl-N-
methylglucamide, decanoyl-N-methylglucamide, Triton X-100, Triton X-114, Thesit,
Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylamminio]-1-
propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylamminio]-2-hydroxy-1-
propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate. 

In more than one embodiment of the above assay methods of the present invention, 
it may be desirable to immobilize either the polypeptide of the invention or its target 
molecule to facilitate separation of complexed from uncomplexed forms of one or both of 
the proteins, as well as to accommodate automation of the assay. Binding of a test
compound to the polypeptide, or interaction of the polypeptide with a target molecule in the 
presence and absence of a candidate compound, can be accomplished in any vessel suitable 
for containing the reactants. Examples of such vessels include microtiter plates, test tubes, 
and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which 
adds a domain that allows one or both of the proteins to be bound to a matrix. For example, 
glutathione-S-transferase fusion proteins can be
adsorbed onto glutathione sepharose beads (Sigma Chemical; St. Louis, MO) or glutathione
derivatized microtiter plates, which are then combined with the test compound or the test
compound and either the non-adsorbed target protein or a polypeptide of the invention, and
the mixture incubated under conditions conducive to complex formation (e.g., at
physiological conditions for salt and pH). Following incubation, the beads or microtiter
plate wells are washed to remove any unbound components and complex formation is 
measured either directly or indirectly, for example, as described above. Alternatively, the 
complexes can be dissociated from the matrix, and the level of binding or activity of the 
polypeptide of the invention can be determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be used in the
screening assays of the invention. For example, either the polypeptide of the invention or 
its target molecule can be immobilized utilizing conjugation of biotin and streptavidin. 
Biotinylated polypeptide of the invention or target molecules can be prepared from biotin-
NHS (N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation
kit, Pierce Chemicals; Rockford, IL), and immobilized in the wells of streptavidin-coated 96
well plates (Pierce Chemical). Alternatively, antibodies reactive with the polypeptide of the 
invention or target molecules but which do not interfere with binding of the polypeptide of 
the invention to its target molecule can be derivatized to the wells of the plate, and unbound 
target or polypeptide of the invention trapped in the wells by antibody conjugation. 
Methods for detecting such complexes, in addition to those described above for the GST- 
immobilized complexes, include immunodetection of complexes using antibodies reactive 
with the polypeptide of the invention or target molecule, as well as enzyme-linked assays 
which rely on detecting an enzymatic activity associated with the polypeptide of the 
invention or target molecule. 

In another embodiment, modulators of expression of a polypeptide of the invention 
are identified in a method in which a cell is contacted with a candidate compound and the 
expression of the selected mRNA or protein (i.e., the mRNA or protein corresponding to a 
polypeptide or nucleic acid of the invention) in the cell is determined. The level of 
expression of the selected mRNA or protein in the presence of the candidate compound is 
compared to the level of expression of the selected mRNA or protein in the absence of the 
candidate compound. The candidate compound can then be identified as a modulator of 
expression of the polypeptide of the invention based on this comparison. For example,
when expression of the selected mRNA or protein is greater (statistically significantly
greater) in the presence of the candidate compound than in its absence, the candidate
compound is identified as a stimulator of the selected mRNA or protein expression.
Alternatively, when expression of the selected mRNA or protein is less (statistically
significantly less) in the presence of the candidate compound than in its absence, the
candidate compound is identified as an inhibitor of the selected mRNA or protein
expression. The level of the selected mRNA or protein expression in the cells can be
determined by methods described herein.
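
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not name a statistical test, so an exact two-sided permutation test on the difference of means is used here as one possible choice, and the function names, the 0.05 threshold, and the input format (replicate expression measurements with and without the candidate compound) are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference in mean expression."""
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    observed = abs(mean(treated) - mean(untreated))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled measurements as "treated".
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels with and without it.

    alpha is an assumed significance threshold, not a value from the text.
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per condition, the smallest attainable p-value is 2/70, so at least four replicates per group are needed before this particular test can reach the assumed 0.05 threshold.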

In yet another aspect of the invention, the polypeptides of the invention can be used as
"bait proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al., 1993, Cell 72:223-32; Madura et al., 1993, J. Biol. Chem.
268:12046-54; Bartel et al., 1993, Bio/Techniques 14:920-4; Iwabuchi et al., 1993,
Oncogene 8:1693-6; and PCT Publication No. WO 94/10300), to identify other proteins
which bind to or interact with the polypeptide of the invention and modulate activity of the
polypeptide of the invention. Such binding proteins are also likely to be involved in the
propagation of signals by the polypeptide of the invention as, for example, upstream or
downstream elements of a signaling pathway involving the polypeptide of the invention.
This invention further pertains to novel agents identified by the above-described 
screening assays and uses thereof for treatments as described herein. 

B. Detection Assays

Portions or fragments of the cDNA sequences identified herein (and the 
corresponding complete gene sequences) can be used in numerous ways as polynucleotide 
reagents. For example, these sequences can be used to: (i) map their respective genes on a 
chromosome and, thus, locate gene regions associated with genetic disease; (ii) identify an 
individual from a minute biological sample (tissue typing); and (iii) aid in forensic 
identification of a biological sample. These applications are described in the subsections 
below. 

1. Chromosome Mapping 

Once the sequence (or a portion of the sequence) of a gene has been isolated, this 
sequence can be used to map the location of the gene on a chromosome. Accordingly,
nucleic acid molecules described herein or fragments thereof can be used to map the
location of the corresponding genes on a chromosome. The mapping of the sequences to
chromosomes is an important first step in correlating these sequences with genes associated
with disease. 

Briefly, genes can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp in length) from the sequence of a gene of the invention. Computer
analysis of the sequence of a gene of the invention can be used to rapidly select primers that
do not span more than one exon in the genomic DNA, since primers that span more than
one exon would complicate the amplification process. These primers can then be used for
PCR screening of somatic cell hybrids
containing individual human chromosomes. Only those hybrids containing the human gene
corresponding to the gene sequences will yield an amplified fragment. For a review of this
technique, see D'Eustachio et al. (1983, Science 220:919-24).
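
The computer-assisted primer selection mentioned above can be illustrated with a minimal Python sketch. It assumes that exon boundaries are already known as half-open (start, end) intervals in cDNA coordinates. The function names and that input format are hypothetical, and real primer design would also weigh melting temperature, GC content, and specificity, none of which is modeled here.

```python
def spans_exon_junction(start, end, exon_bounds):
    """True if the interval [start, end) is not contained in any single exon."""
    return not any(s <= start and end <= e for s, e in exon_bounds)


def primers_within_single_exon(cdna, exon_bounds, min_len=15, max_len=25):
    """List (start, end, sequence) for every candidate primer lying inside one exon.

    exon_bounds: list of half-open (start, end) intervals in cDNA coordinates
    (an assumed input format for this sketch).
    """
    candidates = []
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole primer inside this exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                candidates.append((start, end, cdna[start:end]))
    return candidates
```

For example, with two 20-base exons at (0, 20) and (20, 40), a primer covering positions 10 to 30 is rejected because it crosses the junction, while one covering positions 0 to 15 is retained.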

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular
sequence to a particular chromosome. Three or more sequences can be assigned per day
using a single thermal cycler. Using the nucleic acid sequences of the invention to design
oligonucleotide primers, sublocalization can be achieved with panels of fragments from
specific chromosomes. Other mapping strategies which can similarly be used to map a gene
to its chromosome include in situ hybridization (described in Fan et al., 1990, Proc. Natl.
Acad. Sci. USA 87:6223-7), pre-screening with labeled flow-sorted chromosomes (CITE),
and pre-selection by hybridization to chromosome specific cDNA libraries. Fluorescence in
situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can
further be used to provide a precise chromosomal location in one step. For a review of this
technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques, 1988,
Pergamon Press, NY.
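
The logic of assigning a sequence to a chromosome from a somatic cell hybrid panel can be sketched as a concordance check: the gene is assigned to the chromosome whose presence or absence across the hybrids matches the pattern of PCR amplification. The sketch below assumes a perfectly scored panel. The function name, hybrid identifiers, and data layout are hypothetical.

```python
def assign_chromosome(pcr_results, panel):
    """Return chromosomes whose presence across hybrids matches amplification.

    pcr_results: dict mapping hybrid name -> True if a fragment was amplified.
    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    Both layouts are assumptions made for this illustration.
    """
    all_chromosomes = set().union(*panel.values())
    concordant = []
    for chromosome in sorted(all_chromosomes):
        if all((chromosome in panel[hybrid]) == amplified
               for hybrid, amplified in pcr_results.items()):
            concordant.append(chromosome)
    return concordant
```

In practice a panel may contain scoring errors or partially retained chromosomes, so a real analysis would likely tolerate a small number of discordant hybrids rather than require a perfect match.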

Reagents for chromosome mapping can be used individually to mark a single
chromosome or a single site on that chromosome, or panels of reagents can be used for
marking multiple sites and/or multiple chromosomes. Reagents corresponding to
noncoding regions of the genes actually are preferred for mapping purposes. Coding 
sequences are more likely to be conserved within gene families, thus increasing the chance 
of cross hybridizations during chromosomal mapping. 

Once a sequence has been mapped to a precise chromosomal location, the physical 
position of the sequence on the chromosome can be correlated with genetic map data. 
(Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man,
available on-line through Johns Hopkins University Welch Medical Library). The
relationship between genes and disease, mapped to the same chromosomal region, can then
be identified through linkage analysis (co-inheritance of physically adjacent genes),
described in, e.g., Egeland et al., 1987, Nature 325:783-7.

Moreover, differences in the DNA sequences between individuals affected and 
unaffected with a disease associated with a gene of the invention can be determined. If a 
mutation is observed in some or all of the affected individuals but not in any unaffected 
individuals, then the mutation is likely to be the causative agent of the particular disease. 
Comparison of affected and unaffected individuals generally involves first looking for 
structural alterations in the chromosomes such as deletions or translocations that are visible
from chromosome spreads or detectable using PCR based on that DNA sequence. 
Ultimately, complete sequencing of genes from several individuals can be performed to 
confirm the presence of a mutation and to distinguish mutations from polymorphisms. 
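
The selection rule in this paragraph (a mutation seen in some or all affected individuals but in no unaffected individual) can be expressed as a small set computation. The sketch below is illustrative only. The variant labels and function name are hypothetical, and it does not attempt to distinguish mutations from polymorphisms, which the text notes requires further sequencing.

```python
def candidate_causative_variants(affected, unaffected):
    """Map each variant absent from all unaffected individuals to the
    fraction of affected individuals carrying it.

    affected / unaffected: lists of sets of variant labels, one set per individual
    (an assumed representation for this sketch).
    """
    seen_in_unaffected = set().union(*unaffected) if unaffected else set()
    counts = {}
    for variants in affected:
        for variant in variants:
            counts[variant] = counts.get(variant, 0) + 1
    return {variant: n / len(affected)
            for variant, n in counts.items()
            if variant not in seen_in_unaffected}
```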

Furthermore, the nucleic acid sequences disclosed herein can be used to perform 
searches against "mapping databases", e.g., BLAST-type search, such that the chromosome 
position of the gene is identified by sequence homology or identity with known sequence
fragments which have been mapped to chromosomes.

A polypeptide and fragments and sequences thereof and antibodies specific thereto
can be used to map the location of the gene encoding the polypeptide on a chromosome.
This mapping can be carried out by specifically detecting the presence of the polypeptide in
members of a panel of somatic cell hybrids between cells of a first species of animal from
which the protein originates and cells from a second species of animal and then determining
which somatic cell hybrid(s) expresses the polypeptide and noting the chromosome(s) from
the first species of animal that it contains. For examples of this technique, see Pajunen et
al., 1988, Cytogenet. Cell Genet. 47:37-41 and Van Keuren et al., 1986, Hum. Genet.
74:34-40. Alternatively, the presence of the polypeptide in the somatic cell hybrids can be
determined by assaying an activity or property of the polypeptide, for example, enzymatic
activity, as described in Bordelon-Riser et al., 1979, Somatic Cell Genetics 5:597-613 and
Owerbach et al., 1978, Proc. Natl. Acad. Sci. USA 75:5640-5644.

2. Tissue Typing 

The nucleic acid sequences of the present invention can also be used to identify 
individuals from minute biological samples. The United States military, for example, is 
considering the use of restriction fragment length polymorphism (RFLP) for identification 
of its personnel. In this technique, an individual's genomic DNA is digested with one or
more restriction enzymes, and probed on a Southern blot to yield unique bands for 
identification. This method does not suffer from the current limitations of "Dog Tags" 
which can be lost, switched, or stolen, making positive identification difficult. The 
sequences of the present invention are useful as additional DNA markers for RFLP 
(described in U.S. Patent 5,272,057). 

Furthermore, the sequences of the present invention can be used to provide an 
alternative technique which determines the actual base-by-base DNA sequence of selected 
portions of an individual's genome. Thus, the nucleic acid sequences described herein can
be used to prepare two PCR primers from the 5' and 3' ends of the sequences. These
primers can then be used to amplify an individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from individuals, prepared in this manner, 
can provide unique individual identifications, as each individual will have a unique set of 
such DNA sequences due to allelic differences. The sequences of the present invention can 
be used to obtain such identification sequences from individuals and from tissue. The 
nucleic acid sequences of the invention uniquely represent portions of the human genome. 
Allelic variation occurs to some degree in the coding regions of these sequences, and to a 
greater degree in the noncoding regions. It is estimated that allelic variation between 
individual humans occurs with a frequency of about once per each 500 bases. Each of the 
sequences described herein can, to some degree, be used as a standard against which DNA 
from an individual can be compared for identification purposes. Because greater numbers 
of polymorphisms occur in the noncoding regions, fewer sequences are necessary to 
differentiate individuals. The noncoding sequences of SEQ ID NOs:1, 4, 7, 10, 13, 16, 19,
22, 25, and 28 can comfortably provide positive individual identification with a panel of
perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases.
If predicted coding sequences, such as those in SEQ ID NOs:3, 6, 9, 12, 15, 18, 21, 24, 27,
and 30 are used, a more appropriate number of primers for positive individual identification
would be 500-2,000.

If a panel of reagents from the nucleic acid sequences described herein is used to 
generate a unique identification database for an individual, those same reagents can later be 
used to identify tissue from that individual. Using the unique identification database, 
positive identification of the individual, living or dead, can be made from extremely small
tissue samples. 
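
As a rough illustration of how such an identification database might be queried, the sketch below compares the sequences amplified from a tissue sample with stored per-individual profiles. Everything here is an assumption made for illustration: the locus names, the dictionary layout, the requirement of exact sequence identity at every shared locus, and the minimum number of shared loci.

```python
def match_profile(sample, database, min_shared_loci=3):
    """Return the individuals whose stored profile agrees with the sample.

    sample: dict mapping locus name -> amplified sequence from the tissue.
    database: dict mapping individual -> dict of locus name -> reference sequence.
    An individual matches only if enough loci are shared and all of them agree.
    """
    matches = []
    for person, reference in database.items():
        shared = sample.keys() & reference.keys()
        if len(shared) < min_shared_loci:
            continue  # too little overlap to support a positive identification
        if all(sample[locus] == reference[locus] for locus in shared):
            matches.append(person)
    return sorted(matches)
```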

3. Use of Partial Gene Sequences in Forensic Biology

DNA-based identification techniques can also be used in forensic biology. Forensic 
biology is a scientific field employing genetic typing of biological evidence found at a 
crime scene as a means for positively identifying, for example, a perpetrator of a crime. To 
make such an identification, PCR technology can be used to amplify DNA sequences taken
from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g.,
blood, saliva, or semen found at a crime scene. The amplified sequence can then be
compared to a standard, thereby allowing identification of the origin of the biological
sample. 

The sequences of the present invention can be used to provide polynucleotide 
reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can
enhance the reliability of DNA-based forensic identifications by, for example, providing
another "identification marker" (i.e., another DNA sequence that is unique to a particular
individual). As mentioned above, actual base sequence information can be used for 
identification as an accurate alternative to patterns formed by restriction enzyme generated 
fragments. Sequences targeted to noncoding regions are particularly appropriate for this use 
as greater numbers of polymorphisms occur in the noncoding regions, making it easier to 
differentiate individuals using this technique. Examples of polynucleotide reagents include 
the nucleic acid sequences of the invention or portions thereof, e.g., fragments derived from 
noncoding regions having a length of at least 20 or 30 bases. 

The nucleic acid sequences described herein can further be used to provide
polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example,
an in situ hybridization technique, to identify a specific tissue, e.g., brain tissue. This can
be very useful in cases where a forensic pathologist is presented with a tissue of unknown 
origin. Panels of such probes can be used to identify tissue by species and/or by organ type. 



C. Predictive Medicine:

The present invention also pertains to the field of predictive medicine in which 
diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic
(predictive) purposes to thereby treat an individual prophylactically. Accordingly, one
aspect of the present invention relates to diagnostic assays for determining INTERCEPT 
340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 
378 protein and/or nucleic acid expression as well as INTERCEPT 340, MANGO 003, 
MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 activity, in the 
context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine
whether an individual is afflicted with a disease or disorder, or is at risk of developing a 
disorder, associated with aberrant or unwanted INTERCEPT 340, MANGO 003, MANGO 
347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 gene expression or activity. 
The invention also provides for prognostic (or predictive) assays for determining whether 
an individual is at risk of developing a disorder associated with INTERCEPT 340, 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 
protein or nucleic acid expression or activity. For example, mutations in a gene can be 
assayed in a biological sample. Such assays can be used for prognostic or predictive
purposes to thereby prophylactically treat an individual prior to the onset of a disorder
characterized by or associated with protein or nucleic acid expression or activity. 

As an alternative to making determinations based on the absolute expression level of 
selected genes, determinations may be based on the normalized expression levels of these 
genes. Expression levels are normalized by correcting the absolute expression level of an
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354,
or TANGO 378 gene by comparing its expression to the expression of a gene that is not an
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354,
or TANGO 378 gene, e.g., a housekeeping gene that is constitutively expressed. Suitable genes
for normalization include housekeeping genes such as the actin gene. This normalization
allows the comparison of the expression level in one sample, e.g., a patient sample, to 
another sample, e.g., a non-disease sample, or between samples from different sources. 
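
A minimal sketch of this normalization follows, assuming that "correcting" means dividing the absolute level of the gene of interest by the level of a housekeeping gene measured in the same sample. The text does not prescribe a formula, so the ratio used here, the function name, and the example values are all assumptions.

```python
def normalized_expression(target_level, housekeeping_level):
    """Express a gene's absolute level relative to a housekeeping gene
    (e.g., actin) measured in the same sample."""
    if housekeeping_level <= 0:
        raise ValueError("housekeeping gene level must be positive")
    return target_level / housekeeping_level


# Hypothetical measurements in arbitrary units.
patient = normalized_expression(target_level=240.0, housekeeping_level=80.0)
control = normalized_expression(target_level=150.0, housekeeping_level=100.0)
fold_difference = patient / control
```

Because each sample is scaled by its own housekeeping signal, the two normalized values can be compared even when the samples differ in total input material.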

Alternatively, the expression level can be provided as a relative expression level. To 
determine a relative expression level of a gene, the level of expression of the gene is 
determined for 10 or more samples of different cell isolates, preferably 50 or more samples,
prior to the determination of the expression level for the sample in question. The mean 
expression level of each of the genes assayed in the larger number of samples is determined 
and this is used as a baseline expression level for the gene(s) in question. The expression 
level of the gene determined for the test sample (absolute level of expression) is then 
divided by the mean expression value obtained for that gene. This provides a relative 
expression level and aids in identifying extreme cases of disease. 
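
The relative expression level described above is the test sample's level divided by the mean level across a set of baseline samples. The sketch below follows that definition directly. The function name and the way the 10-sample minimum is enforced are assumptions added for illustration.

```python
from statistics import mean


def relative_expression(test_level, baseline_levels, min_samples=10):
    """Divide a test sample's expression level by the mean of baseline samples.

    The text calls for 10 or more baseline samples (preferably 50 or more);
    min_samples enforces that lower bound in this sketch.
    """
    if len(baseline_levels) < min_samples:
        raise ValueError("at least %d baseline samples are required" % min_samples)
    return test_level / mean(baseline_levels)
```

A value well above or below 1 indicates that the test sample departs from the baseline mean, which is what makes extreme cases stand out.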

Preferably, the samples used in the baseline determination will be from diseased or
from non-diseased cells of tissue. The choice of the cell source is dependent on the use of
the relative expression level. Using expression found in normal tissues as a mean
expression score aids in validating whether the INTERCEPT 340, MANGO 003, MANGO
347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 gene assayed is diseased
cell-type specific (versus normal cells). Such a use is particularly important in identifying
whether an INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295,
TANGO 354, or TANGO 378 gene can serve as a target gene. In addition, as more data is
accumulated, the mean expression value can be revised, providing improved relative 
expression values based on accumulated data. Expression data from cells provide a means 
for grading the severity of the disease state. 

Another aspect of the invention pertains to monitoring the influence of agents (e.g.,
drugs, compounds) on the expression or activity of INTERCEPT 340, MANGO 003, 
MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 genes in clinical 
trials. 

These and other agents are described in further detail in the following sections. 

1. Diagnostic Assays

An exemplary method for detecting the presence or absence of a polypeptide or 
nucleic acid of the invention in a biological sample involves obtaining a biological sample 
from a test subject and contacting the biological sample with a compound or an agent 
capable of detecting a polypeptide or nucleic acid (e.g., mRNA, genomic DNA) of the 
invention such that the presence of a polypeptide or nucleic acid of the invention is detected 
in the biological sample. A preferred agent for detecting mRNA or genomic DNA 
encoding a polypeptide of the invention is a labeled nucleic acid probe capable of 
hybridizing to mRNA or genomic DNA encoding a polypeptide of the invention. The 
nucleic acid probe can be, for example, a full-length cDNA, such as the nucleic acid of SEQ 
ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a portion
thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in 
length and sufficient to specifically hybridize under stringent conditions to a mRNA or 
genomic DNA encoding a polypeptide of the invention. Other suitable probes for use in the 
diagnostic assays of the invention are described herein. 

A preferred agent for detecting a polypeptide of the invention is an antibody capable
of binding to a polypeptide of the invention, preferably an antibody with a detectable label.
Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a
fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the
probe or antibody, is intended to encompass direct labeling of the probe or antibody by
coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as
indirect labeling of the probe or antibody by reactivity with another reagent that is directly
labeled. Examples of indirect labeling include detection of a primary antibody using a
fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such
that it can be detected with fluorescently labeled streptavidin. The term "biological sample"
is intended to include tissues, cells and biological fluids isolated from a subject, as well as
tissues, cells and fluids present within a subject. That is, the detection method of the
invention can be used to detect mRNA, protein, or genomic DNA in a biological sample in
vitro as well as in vivo. For example, in vitro techniques for detection of mRNA include
Northern hybridizations and in situ hybridizations. In vitro techniques for detection of a 
polypeptide of the invention include enzyme linked immunosorbent assays (ELISAs), 
Western blots, immunoprecipitations and immunofluorescence. In vitro techniques for 
detection of genomic DNA include Southern hybridizations. Furthermore, in vivo 
techniques for detection of a polypeptide of the invention include introducing into a subject 
a labeled antibody directed against the polypeptide. For example, the antibody can be 
labeled with a radioactive marker whose presence and location in a subject can be detected 
by standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test 
subject. Alternatively, the biological sample can contain mRNA molecules from the test 
subject or genomic DNA molecules from the test subject. A preferred biological sample is 
a peripheral blood leukocyte sample isolated by conventional means from a subject. 

In another embodiment, the methods further involve obtaining a control biological 
sample from a control subject, contacting the control sample with a compound or agent
capable of detecting a polypeptide of the invention or mRNA or genomic DNA encoding a 
polypeptide of the invention, such that the presence of the polypeptide or mRNA or 
genomic DNA encoding the polypeptide is detected in the biological sample, and 
comparing the presence of the polypeptide or mRNA or genomic DNA encoding the 
polypeptide in the control sample with the presence of the polypeptide or mRNA or 
genomic DNA encoding the polypeptide in the test sample.

The invention also encompasses kits for detecting the presence of a polypeptide or 
nucleic acid of the invention in a biological sample (a test sample). Such kits can be used to 
determine if a subject is suffering from or is at increased risk of developing a disorder 
associated with aberrant expression of a polypeptide of the invention (e.g., a proliferative 
disorder, e.g., psoriasis or cancer). For example, the kit can comprise a labeled compound
or agent capable of detecting the polypeptide or mRNA encoding the polypeptide in a 
biological sample and means for determining the amount of the polypeptide or mRNA in 
the sample (e.g., an antibody which binds the polypeptide or an oligonucleotide probe 
which binds to DNA or mRNA encoding the polypeptide). Kits can also include 
instructions for observing that the tested subject is suffering from or is at risk of developing
a disorder associated with aberrant expression of the polypeptide if the amount of the
polypeptide or mRNA encoding the polypeptide is above or below a normal level.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g.,
attached to a solid support) which binds to a polypeptide of the invention; and, optionally,
(2) a second, different antibody which binds to either the polypeptide or the first antibody
and is conjugated to a detectable agent.

For oligonucleotide-based kits, the kit can comprise, for example: (1) an 
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic 
acid sequence encoding a polypeptide of the invention or (2) a pair of primers useful for 
amplifying a nucleic acid molecule encoding a polypeptide of the invention. The kit can 
also comprise, e.g., a buffering agent, a preservative, or a protein stabilizing agent. The kit 
can also comprise components necessary for detecting the detectable agent (e.g., an enzyme 
or a substrate). The kit can also contain a control sample or a series of control samples 
which can be assayed and compared to the test sample contained. Each component of the 
kit is usually enclosed within an individual container and all of the various containers are 
within a single package along with instructions for observing whether the tested subject is 
suffering from or is at risk of developing a disorder associated with aberrant expression of 
the polypeptide. 

2. Prognostic Assays

The methods described herein can furthermore be utilized as diagnostic or 
prognostic assays to identify subjects having or at risk of developing a disease or disorder 
associated with aberrant expression or activity of a polypeptide of the invention. For 
example, the assays described herein, such as the preceding diagnostic assays or the 
following assays, can be utilized to identify a subject having or at risk of developing a
disorder associated with aberrant expression or activity of a polypeptide of the invention.
Alternatively, the prognostic assays can be utilized to identify a subject having or at risk for 
developing such a disease or disorder. Thus, the present invention provides a method in 
which a test sample is obtained from a subject and a polypeptide or nucleic acid (e.g., 
mRNA, genomic DNA) of the invention is detected, wherein the presence of the 
polypeptide or nucleic acid is diagnostic for a subject having or at risk of developing a 
disease or disorder associated with aberrant expression or activity of the polypeptide. As 
used herein, a "test sample" refers to a biological sample obtained from a subject of interest. 
For example, a test sample can be a biological fluid (e.g., serum), cell sample, or tissue. 
Furthermore, the prognostic assays described herein can be used to determine
whether a subject can be administered an agent (e.g., an agonist, antagonist,
peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to
treat a disease or disorder associated with aberrant expression or activity of a polypeptide of
the invention. For example, such methods can be used to determine whether a subject can
be effectively treated with a specific agent or class of agents (e.g., agents of a type which
decrease activity of the polypeptide). Thus, the present invention provides methods for
determining whether a subject can be effectively treated with an agent for a disorder
associated with aberrant expression or activity of a polypeptide of the invention in which a
test sample is obtained and the polypeptide or nucleic acid encoding the polypeptide is
detected (e.g., wherein the presence of the polypeptide or nucleic acid is diagnostic for a
subject that can be administered the agent to treat a disorder associated with aberrant
expression or activity of the polypeptide).

The methods of the invention can also be used to detect genetic lesions or mutations 
in a gene of the invention, thereby determining if a subject with the lesioned gene is at risk 
for a disorder characterized by aberrant expression or activity of a polypeptide of the invention.
In preferred embodiments, the methods include detecting, in a sample of cells from the 
subject, the presence or absence of a genetic lesion or mutation characterized by at least one 
of an alteration affecting the integrity of a gene encoding the polypeptide of the invention, 
or the mis-expression of the gene encoding the polypeptide of the invention. For example, 
such genetic lesions or mutations can be detected by ascertaining the existence of at least 
one of: 1) a deletion of one or more nucleotides from the gene; 2) an addition of one or 
more nucleotides to the gene; 3) a substitution of one or more nucleotides of the gene; 4) a 
chromosomal rearrangement of the gene; 5) an alteration in the level of a messenger RNA 
transcript of the gene; 6) an aberrant modification of the gene, such as of the methylation 
pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a 
messenger RNA transcript of the gene; 8) a non-wild type level of the protein encoded by
the gene; 9) an allelic loss of the gene; and 10) an inappropriate post-translational 
modification of the protein encoded by the gene. As described herein, there are a large 
number of assay techniques known in the art which can be used for detecting lesions in a 
gene. 

In certain embodiments, detection of the lesion involves the use of a probe/primer in
a polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202),
such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR)
(see, e.g., Landegren et al., 1988, Science 241:1077-80; and Nakazawa et al., 1994, Proc.
Natl. Acad. Sci. USA 91:360-4), the latter of which can be particularly useful for detecting
point mutations in a gene (see, e.g., Abravaya et al., 1995, Nucleic Acids Res. 23:675-82).
This method can include the steps of collecting a sample of cells from a patient, isolating
nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the
nucleic acid sample with one or more primers which specifically hybridize to the selected
gene under conditions such that hybridization and amplification of the gene (if present)
occurs, and detecting the presence or absence of an amplification product, or detecting the
size of the amplification product and comparing the length to a control sample. It is
anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification
step in conjunction with any of the techniques used for detecting mutations described
herein.

Alternative amplification methods include: self sustained sequence replication 
(Guatelli et al., 1990, Proc. Natl. Acad. Sci. USA 87:1874-78), transcriptional amplification
system (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-7), Q-Beta Replicase
(Lizardi et al., 1988, Bio/Technology 6:1197), or any other nucleic acid amplification
method, followed by the detection of the amplified molecules using techniques well known
to those of skill in the art. These detection schemes are especially useful for the detection
of nucleic acid molecules if such molecules are present in very low numbers. 

In an alternative embodiment, mutations in a selected gene from a sample cell can 
be identified by alterations in restriction enzyme cleavage patterns. For example, sample
and control DNA is isolated, amplified (optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined by gel electrophoresis and
compared. Differences in fragment length sizes between sample and control DNA indicate
mutations in the sample DNA. Moreover, sequence specific ribozymes (see, e.g.,
U.S. Patent No. 5,498,531) can be used to score for the presence of specific mutations by
development or loss of a ribozyme cleavage site.
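
The restriction-pattern comparison can be illustrated with a simplified in silico digest. This sketch treats the DNA as a single linear strand, cuts at a fixed offset within each recognition site, and compares the resulting fragment lengths. The function name and the example sequences are hypothetical. The EcoRI site GAATTC, cut after the first G, is used as a familiar example.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position within the recognition site at which the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical control and sample sequences; the sample carries a
# substitution (GAATTC -> GAGTTC) that destroys the second site.
control_dna = "AAAAGAATTCCCCCCCGAATTCTT"
sample_dna = "AAAAGAATTCCCCCCCGAGTTCTT"
control_fragments = digest(control_dna, "GAATTC", 1)
sample_fragments = digest(sample_dna, "GAATTC", 1)
mutation_indicated = sorted(control_fragments) != sorted(sample_fragments)
```

Losing a site merges two control fragments into one longer sample fragment, which is the kind of band shift gel electrophoresis would reveal.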

In other embodiments, genetic mutations can be identified by hybridizing a sample 
and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or
thousands of oligonucleotide probes (Cronin et al., 1996, Human Mutation 7:244-55;
Kozal et al., 1996, Nature Medicine 2:753-9). For example, genetic mutations can be
identified in two-dimensional arrays containing light-generated DNA probes as described in
Cronin et al., supra. Briefly, a first hybridization array of probes can be used to scan
through long stretches of DNA in a sample and control to identify base changes between the
sequences by making linear arrays of sequential overlapping probes. This step allows the
identification of point mutations. This step is followed by a second hybridization array that
allows the characterization of specific mutations by using smaller, specialized probe arrays
complementary to all variants or mutations detected. Each mutation array is composed of
parallel probe sets, one complementary to the wild-type gene and the other complementary
to the mutant gene. 
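
The two array designs described above can be sketched as simple string operations: a tiling function that produces sequential overlapping probes across a reference sequence, and a function that produces a paired wild-type and mutant probe for one detected variant. Probe length, step size, and function names are assumptions, and the sketch ignores the hybridization chemistry entirely.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tiling_probes(reference, probe_len=25, step=1):
    """Sequential overlapping probes, each complementary to a reference window."""
    return [(start, reverse_complement(reference[start:start + probe_len]))
            for start in range(0, len(reference) - probe_len + 1, step)]


def mutation_probe_pair(reference, position, variant_base, probe_len=25):
    """Wild-type and mutant probes covering one substitution.

    position is a 0-based index into reference; the window is roughly centered
    on it and clamped to the ends of the sequence.
    """
    start = max(0, min(position - probe_len // 2, len(reference) - probe_len))
    wild_window = reference[start:start + probe_len]
    offset = position - start
    mutant_window = wild_window[:offset] + variant_base + wild_window[offset + 1:]
    return reverse_complement(wild_window), reverse_complement(mutant_window)
```

A sample that hybridizes more strongly to the mutant probe than to its wild-type partner would point to the corresponding substitution.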

In yet another embodiment, any of a variety of sequencing reactions known in the 
art can be used to directly sequence the selected gene and detect mutations by comparing 
the sequence of the sample nucleic acids with the corresponding wild-type (control) 
sequence. Examples of sequencing reactions include those based on techniques developed 
by Maxam and Gilbert (1977, Proc. Natl. Acad. Sci. USA 74:560) or Sanger (1977, Proc. 
Natl. Acad. Sci. USA 74:5463). It is also contemplated that any of a variety of automated 
sequencing procedures can be utilized when performing the diagnostic assays developed by 
Naeve et al. (1995, Bio/Techniques 19:448-53), including sequencing by mass spectrometry 
(see, e.g., PCT Publication No. WO 94/16101; Cohen et al., 1996, Adv. Chromatogr. 
36:127-62; and Griffin et al., 1993, Appl. Biochem. Biotechnol. 38:147-59). 
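
As a minimal sketch of the comparison step described above, the following reports single-base differences between a sample read and the wild-type (control) sequence. It assumes the two sequences have already been aligned and are of equal length, so insertions and deletions are outside its scope; the function name and the example sequences are hypothetical.

```python
def point_differences(wild_type, sample):
    """List (position, wild_type_base, sample_base) for every mismatch.

    Positions are 1-based. The sequences are assumed to be pre-aligned and
    of equal length; this sketch does not perform alignment.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    pairs = zip(wild_type.upper(), sample.upper())
    return [(i + 1, w, s) for i, (w, s) in enumerate(pairs) if w != s]


# Hypothetical example: a G to A substitution at position 4.
print(point_differences("ATGGCCTA", "ATGACCTA"))
```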

Other methods for detecting mutations in a selected gene include methods in which 
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or 
RNA/DNA heteroduplexes (Myers et al., 1985, Science 230:1242). In general, the 
technique of mismatch cleavage entails providing heteroduplexes formed by hybridizing 
(labeled) RNA or DNA containing the wild-type sequence with potentially mutant RNA or 
DNA obtained from a tissue sample. The double-stranded duplexes are treated with an 
agent which cleaves single-stranded regions of the duplex, such as those that exist due to 
basepair mismatches between the control and sample strands. RNA/DNA duplexes can be 
treated with RNase to digest mismatched regions, and DNA/DNA hybrids can be treated 
with S1 nuclease to digest mismatched regions. 

In other embodiments, either DNA/DNA or RNA/DNA duplexes can be treated with 
hydroxylamine or osmium tetroxide and with piperidine in order to digest mismatched 
regions. After digestion of the mismatched regions, the resulting material is then separated 
by size on denaturing polyacrylamide gels to determine the site of mutation. See, e.g., 
Cotton et al., 1988, Proc. Natl. Acad. Sci. USA 85:4397; Saleeba et al., 1992, Methods 
Enzymol. 217:286-95. In a preferred embodiment, the control DNA or RNA can be labeled 
for detection. 

In still another embodiment, the mismatch cleavage reaction employs one or more 
proteins that recognize mismatched base pairs in double-stranded DNA (so-called DNA 
mismatch repair enzymes) in defined systems for detecting and mapping point mutations in 
cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves 
A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at 
G/T mismatches (Hsu et al., 1994, Carcinogenesis 15:1657-62). According to an 
exemplary embodiment, a probe based on a selected sequence, e.g., a wild-type sequence, is 
hybridized to a cDNA or other DNA product from a test cell(s). The duplex is treated with 
a DNA mismatch repair enzyme, and the cleavage products, if any, can be detected from 
electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039. 

In other embodiments, alterations in electrophoretic mobility will be used to identify 
mutations in genes. For example, single strand conformation polymorphism (SSCP) may 
be used to detect differences in electrophoretic mobility between mutant and wild-type 
nucleic acids (Orita et al., 1989, Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton, 1993, 
Mutat. Res. 285:125-44; Hayashi, 1992, Genet. Anal. Tech. Appl. 9:73-9). Single-stranded 
DNA fragments of sample and control nucleic acids will be denatured and allowed to 
renature. The secondary structure of single-stranded nucleic acids varies according to 
sequence, and the resulting alteration in electrophoretic mobility enables the detection of 
even a single base change. The DNA fragments may be labeled or detected with labeled 
probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in 
which the secondary structure is more sensitive to a change in sequence. In a preferred 
embodiment, the subject method utilizes heteroduplex analysis to separate double stranded 
heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al., 
1991, Trends Genet. 7:5). 

In yet another embodiment, the movement of mutant or wild-type fragments in 
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing 
gradient gel electrophoresis (DGGE) (Myers et al., 1985, Nature 313:495). When DGGE is 
used as the method of analysis, DNA will be modified to ensure that it does not completely 
denature, for example by adding a 'GC clamp' of approximately 40 bp of high-melting 
GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a 
denaturing gradient to identify differences in the mobility of control and sample DNA 
(Rosenbaum and Reissner, 1987, Biophys. Chem. 265:12753). 

Examples of other techniques for detecting point mutations include, but are not 
limited to, selective oligonucleotide hybridization, selective amplification, or selective 
primer extension. For example, oligonucleotide primers may be prepared in which the 
known mutation is placed centrally and then hybridized to target DNA under conditions 
which permit hybridization only if a perfect match is found (Saiki et al., 1986, Nature 
324:163; Saiki et al., 1989, Proc. Natl. Acad. Sci. USA 86:6230). Such allele specific 
oligonucleotides are hybridized to PCR amplified target DNA or a number of different 
mutations when the oligonucleotides are attached to the hybridizing membrane and 
hybridized with labeled target DNA. 
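
For illustration, the preparation of a pair of allele specific oligonucleotides with the known mutation placed centrally can be sketched as below. The reference sequence, the probe length, and the function name are hypothetical choices for this sketch; practical probe design would also consider melting temperature and strand, which are not modeled here.

```python
def allele_specific_probes(reference, position, variant_base, length=19):
    """Return (wild_type_probe, mutant_probe) centered on a known mutation.

    position is 1-based within reference. length must be odd so that the
    variable base sits exactly in the middle of each probe.
    """
    if length % 2 == 0:
        raise ValueError("length must be odd to center the mutation")
    flank = length // 2
    idx = position - 1
    if idx - flank < 0 or idx + flank >= len(reference):
        raise ValueError("not enough flanking sequence around the mutation")
    wild_type = reference[idx - flank: idx + flank + 1].upper()
    if wild_type[flank] == variant_base.upper():
        raise ValueError("variant base equals the reference base")
    mutant = wild_type[:flank] + variant_base.upper() + wild_type[flank + 1:]
    return wild_type, mutant


# Hypothetical 21-base reference with a G to T change at position 11.
print(allele_specific_probes("ACGTACGTACGTACGTACGTA", 11, "T", length=7))
```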

Alternatively, allele specific amplification technology which depends on selective 
PCR amplification may be used in conjunction with the instant invention. Oligonucleotides 
used as primers for specific amplification may carry the mutation of interest in the center of 
the molecule (so that amplification depends on differential hybridization; Gibbs et al., 1989, 
Nucleic Acids Res. 17:2437-48) or at the extreme 3' end of one primer where, under 
appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner, 
1993, Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site 
in the region of the mutation to create cleavage-based detection (Gasparini et al., 1992, Mol. 
Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be 
performed using Taq ligase for amplification (Barany, 1991, Proc. Natl. Acad. Sci. USA 
88:189). In such cases, ligation will occur only if there is a perfect match at the 3' end of 
the 5' sequence, making it possible to detect the presence of a known mutation at a specific 
site by looking for the presence or absence of amplification. 

The methods described herein may be performed, for example, by utilizing 
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent 
described herein, which may be conveniently used in clinical settings to diagnose 
patients exhibiting symptoms or family history of a disease or illness involving a gene 
encoding a polypeptide of the invention. Furthermore, any cell type or tissue, preferably 
peripheral blood leukocytes, in which the polypeptide of the invention is expressed may be 
utilized in the prognostic assays described herein. 

3. Pharmacogenomics 

Agents, or modulators, which have a stimulatory or inhibitory effect on activity or 
expression of a polypeptide of the invention as identified by a screening assay described 
herein can be administered to individuals to treat (prophylactically or therapeutically) 
disorders associated with aberrant activity of the polypeptide. In conjunction with such 
treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's 
genotype and that individual's response to a foreign compound or drug) of the individual 
may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or 
therapeutic failure by altering the relation between dose and blood concentration of the 
pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the 
selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on 
a consideration of the individual's genotype. Such pharmacogenomics can further be used 
to determine appropriate dosages and therapeutic regimens. Accordingly, the activity of a 
polypeptide of the invention, expression of a nucleic acid of the invention, or mutation 
content of a gene of the invention in an individual can be determined to thereby select 
appropriate agent(s) for therapeutic or prophylactic treatment of the individual. 

Pharmacogenomics deals with clinically significant hereditary variations in the 
response to drugs due to altered drug disposition and abnormal action in affected persons. 
See, e.g., Linder, 1997, Clin. Chem. 43(2):254-66. In general, two types of 
pharmacogenetic conditions can be differentiated. Genetic conditions transmitted as a 
single factor altering the way drugs act on the body are referred to as "altered drug action." 
Genetic conditions transmitted as single factors altering the way the body acts on drugs are 
referred to as "altered drug metabolism." These pharmacogenetic conditions can occur 
either as rare defects or as polymorphisms. For example, glucose-6-phosphate 
dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main 
clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, 
sulfonamides, analgesics, nitrofurans) and consumption of fava beans. 

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major 
determinant of both the intensity and duration of drug action. The discovery of genetic 
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and 
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why 
some patients do not obtain the expected drug effects or show exaggerated drug response 
and serious toxicity after taking the standard and safe dose of a drug. These polymorphisms 
are expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor 
metabolizer (PM). The prevalence of PM is different among different populations. For 
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have 
been identified in PM, which all lead to the absence of functional CYP2D6. Poor 
metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug 
response and side effects when they receive standard doses. If a metabolite is the active 
therapeutic moiety, a PM will show no therapeutic response, as demonstrated for the 
analgesic effect of codeine mediated by its CYP2D6-formed metabolite morphine. At the 
other extreme are the so-called ultra-rapid metabolizers who do not respond to standard 
doses. Recently, the molecular basis of ultra-rapid metabolism has been identified to be due 
to CYP2D6 gene amplification. 

Thus, the activity of a polypeptide of the invention, expression of a nucleic acid 
encoding the polypeptide, or mutation content of a gene encoding the polypeptide in an 
individual can be determined to thereby select appropriate agent(s) for therapeutic or 
prophylactic treatment of the individual. In addition, pharmacogenetic studies can be used 
to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the 
identification of an individual's drug responsiveness phenotype. This knowledge, when 
applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and 
thus enhance therapeutic or prophylactic efficiency when treating a subject with a 
modulator of activity or expression of the polypeptide, such as a modulator identified by 
one of the exemplary screening assays described herein. 

4. Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or 
activity of a polypeptide of the invention (e.g., the ability to modulate aberrant cell 
proliferation, chemotaxis, and/or differentiation) can be applied not only in basic drug 
screening, but also in clinical trials. For example, the effectiveness of an agent, as 
determined by a screening assay as described herein, to increase gene expression, protein 
levels or protein activity, can be monitored in clinical trials of subjects exhibiting decreased 
gene expression, protein levels, or protein activity. Alternatively, the effectiveness of an 
agent, as determined by a screening assay, to decrease gene expression, protein levels or 
protein activity, can be monitored in clinical trials of subjects exhibiting increased gene 
expression, protein levels, or protein activity. In such clinical trials, expression or activity 
of a polypeptide of the invention and preferably, that of other polypeptides that have been 
implicated in, for example, a cellular proliferation disorder, can be used as a marker of the 
immune responsiveness of a particular cell. 

For example, and not by way of limitation, genes, including those of the invention, 
that are modulated in cells by treatment with an agent (e.g., compound, drug or small 
molecule) which modulates activity or expression of a polypeptide of the invention (e.g., as 
identified in a screening assay described herein) can be identified. Thus, to study the effect 
of agents on cellular proliferation disorders, for example, in a clinical trial, cells can be 
isolated and RNA prepared and analyzed for the levels of expression of a gene of the 
invention and other genes implicated in the disorder. The levels of gene expression (i.e., a 
gene expression pattern) can be quantified by Northern blot analysis or RT-PCR, as 
described herein, or alternatively by measuring the amount of protein produced, by one of 
the methods as described herein, or by measuring the levels of activity of a gene of the 
invention or other genes. In this way, the gene expression pattern can serve as a marker, 
indicative of the physiological response of the cells to the agent. Accordingly, this response 
state may be determined before, and at various points during, treatment of the individual 
with the agent. 

In a preferred embodiment, the present invention provides a method for monitoring 
the effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, 
peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate 
identified by the screening assays described herein) comprising the steps of (i) obtaining a 
pre-administration sample from a subject prior to administration of the agent; (ii) detecting 
the level of the polypeptide or nucleic acid of the invention in the pre-administration sample; 
(iii) obtaining one or more post-administration samples from the subject; (iv) detecting the 
level of the polypeptide or nucleic acid of the invention in the post-administration 
samples; (v) comparing the level of the polypeptide or nucleic acid of the invention in the 
pre-administration sample with the level of the polypeptide or nucleic acid of the invention 
in the post-administration sample or samples; and (vi) altering the administration of the 
agent to the subject accordingly. For example, increased administration of the agent may be 
desirable to increase the expression or activity of the polypeptide to higher levels than 
detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased 
administration of the agent may be desirable to decrease expression or activity of the 
polypeptide to lower levels than detected, i.e., to decrease the effectiveness of the agent. 
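
Steps (v) and (vi) of the monitoring method above can be sketched, purely for illustration, as a comparison of post-administration levels against the pre-administration level. The target fold change, the tolerance, and the function names are hypothetical parameters introduced for this sketch, which assumes an agent intended to raise the measured level; it is not a clinical decision rule.

```python
def fold_changes(pre_level, post_levels):
    """Step (v): express each post-administration level relative to the
    pre-administration level."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration level is needed")
    return [post / pre_level for post in post_levels]


def suggest_change(pre_level, post_levels, target_fold, tolerance=0.10):
    """Step (vi): suggest how administration might be altered.

    target_fold and tolerance are hypothetical: the most recent fold change
    is compared with a target window of target_fold +/- tolerance.
    """
    latest = fold_changes(pre_level, post_levels)[-1]
    if latest < target_fold * (1 - tolerance):
        return "increase"
    if latest > target_fold * (1 + tolerance):
        return "decrease"
    return "maintain"


# Hypothetical measurements in arbitrary units.
print(fold_changes(10.0, [12.0, 15.0]))
print(suggest_change(10.0, [12.0, 15.0], target_fold=2.0))
```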


C. Methods of Treatment 

The present invention provides for both prophylactic and therapeutic methods of 
treating a subject at risk of (or susceptible to) a disorder or having a disorder associated 
with aberrant expression or activity of a polypeptide of the invention, e.g., cardiac infection 
(e.g., myocarditis or dilated cardiomyopathy), central nervous system infection (e.g., 
non-specific febrile illness or meningoencephalitis), pancreatic infection (e.g., acute 
pancreatitis), respiratory infection (pneumonia), gastrointestinal infection, type I diabetes, 
cancer, familial hypercholesterolemia, hemophilia B, Marfan syndrome, protein S 
deficiency, allergy, inflammation, and gastroduodenal ulcer. Moreover, the polypeptides of 
the invention can be used to modulate cellular function, survival, morphology, proliferation 
and/or differentiation. 

1. Prophylactic Methods 

In one aspect, the invention provides a method for preventing, in a subject, a disease 
or condition associated with an aberrant expression or activity of a polypeptide of the 
invention, by administering to the subject an agent which modulates expression or at least 
one activity of the polypeptide. Subjects at risk for a disease which is caused or contributed 
to by aberrant expression or activity of a polypeptide of the invention can be identified by, 
for example, any or a combination of diagnostic or prognostic assays as described herein. 
Administration of a prophylactic agent can occur prior to the manifestation of symptoms 
characteristic of the aberrancy, such that a disease or disorder is prevented or, alternatively, 
delayed in its progression. Depending on the type of aberrancy, for example, an agonist or 
antagonist agent can be used for treating the subject. 

2. Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating expression or 
activity of a polypeptide of the invention for therapeutic purposes. The modulatory method 
of the invention involves contacting a cell with an agent that modulates one or more of the 
activities of the polypeptide. An agent that modulates activity can be an agent as described 
herein, such as a nucleic acid or a protein, a naturally-occurring cognate ligand of the 
polypeptide, a peptide, a peptidomimetic, or other small molecule. In one embodiment, the 
agent stimulates one or more of the biological activities of the polypeptide. Examples of 
such stimulatory agents include the active polypeptide of the invention and a nucleic acid 
molecule encoding the polypeptide of the invention that has been introduced into the cell. 
In another embodiment, the agent inhibits one or more of the biological activities of the 
polypeptide of the invention. Examples of such inhibitory agents include antisense nucleic 
acid molecules and antibodies. These modulatory methods can be performed in vitro (e.g., 
by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the 
agent to a subject). As such, the present invention provides methods of treating an 
individual afflicted with a disease or disorder characterized by aberrant expression or 
activity of a polypeptide of the invention. In one embodiment, the method involves 
administering an agent (e.g., an agent identified by a screening assay described herein), or 
combination of agents that modulates (e.g., upregulates or downregulates) expression or 
activity. In another embodiment, the method involves administering a polypeptide of the 
invention or a nucleic acid molecule of the invention as therapy to compensate for reduced 
or aberrant expression or activity of the polypeptide. 

Stimulation of activity is desirable in situations in which activity or expression is 
abnormally low or downregulated and/or in which increased activity is likely to have a 
beneficial effect. Conversely, inhibition of activity is desirable in situations in which 
activity or expression is abnormally high or upregulated and/or in which decreased activity 
is likely to have a beneficial effect. 

The contents of all references, patents and published patent applications cited 
throughout this application are hereby incorporated by reference. 



Deposit of Clones 

Clones containing cDNA molecules encoding human MANGO 003 were deposited 
with the American Type Culture Collection (ATCC®, 10801 University Boulevard, 
Manassas, VA 20110-2209) on March 30, 1999 as Accession Number 207178, as part of a 
composite deposit representing a mixture of three strains, each carrying one recombinant 
plasmid harboring a particular cDNA clone. 

To distinguish the strains and isolate a strain harboring a particular cDNA clone, an 
aliquot of the mixture can be streaked out to single colonies on nutrient medium (e.g., LB 
plates) supplemented with 100 µg/ml ampicillin, single colonies grown, and then plasmid 
DNA extracted using a standard minipreparation procedure. Next, a sample of the DNA 
minipreparation can be digested with a combination of the restriction enzymes Sal I and Not 
I, and the resultant products resolved on a 0.8% agarose gel using standard DNA 
electrophoresis conditions. The digest liberates fragments as follows: 

human MANGO 003 (clone EpthLa6al): 3.2 kB 

The identity of the strains can be inferred from the fragments liberated. 

Clones containing cDNA molecules encoding human INTERCEPT 340, MANGO 
347, and TANGO 272 were deposited with the American Type Culture Collection (ATCC®, 
10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 as Accession 
Number PTA-250, as part of a composite deposit representing a mixture of three strains, 
each carrying one recombinant plasmid harboring a particular cDNA clone. 

To distinguish the strains and isolate a strain harboring a particular cDNA clone, an 
aliquot of the mixture can be streaked out to single colonies on nutrient medium (e.g., LB 
plates) supplemented with 100 µg/ml ampicillin, single colonies grown, and then plasmid 
DNA extracted using a standard minipreparation procedure. Next, a sample of the DNA 
minipreparation can be digested with a combination of the restriction enzymes Sal I and Not 
I, and the resultant products resolved on a 0.8% agarose gel using standard DNA 
electrophoresis conditions. The digest liberates fragments as follows: 

human INTERCEPT 340 (clone EpI340): 3.3 kB 
human MANGO 347 (clone EpM347): 1.4 kB 
human TANGO 272 (clone EpT272): 5.0 kB 

The identity of the strains can be inferred from the fragments liberated. 
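
As an informal illustration of inferring strain identity from the liberated fragment, the sketch below matches an observed insert band against the fragment sizes listed above for Accession Number PTA-250. The size tolerance and the function name are assumptions of this sketch, and only the insert band is considered (any vector band on the gel is ignored).

```python
# Insert sizes (kB) liberated by the Sal I / Not I digest, as listed above
# for the composite deposit PTA-250.
EXPECTED_KB = {
    "INTERCEPT 340": 3.3,
    "MANGO 347": 1.4,
    "TANGO 272": 5.0,
}


def identify_clone(observed_kb, expected=EXPECTED_KB, tolerance_kb=0.2):
    """Return the clone whose expected insert size matches the observed band.

    tolerance_kb is a hypothetical allowance for gel sizing error. None is
    returned when no clone, or more than one clone, matches.
    """
    matches = [name for name, size in expected.items()
               if abs(size - observed_kb) <= tolerance_kb]
    return matches[0] if len(matches) == 1 else None


print(identify_clone(1.5))
print(identify_clone(5.1))
print(identify_clone(4.0))
```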



Clones containing cDNA molecules encoding human TANGO 295, TANGO 354, 
and TANGO 378 were deposited with the American Type Culture Collection (ATCC®, 
10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 as Accession 
Number PTA-249, as part of a composite deposit representing a mixture of three strains, 
each carrying one recombinant plasmid harboring a particular cDNA clone. 

To distinguish the strains and isolate a strain harboring a particular cDNA clone, an 
aliquot of the mixture can be streaked out to single colonies on nutrient medium (e.g., LB 
plates) supplemented with 100 µg/ml ampicillin, single colonies grown, and then plasmid 
DNA extracted using a standard minipreparation procedure. Next, a sample of the DNA 
minipreparation can be digested with a combination of the restriction enzymes Sal I and Not 
I, and the resultant products resolved on a 0.8% agarose gel using standard DNA 
electrophoresis conditions. The digest liberates fragments as follows: 

human TANGO 295 (clone EpT295): 1.5 kB 
human TANGO 354 (clone EpT354): 1.8 kB 
human TANGO 378 (clone EpT378): 3.3 kB 

The identity of the strains can be inferred from the fragments liberated. 

All publications, patents and patent applications mentioned in this specification are 
herein incorporated by reference into the specification to the same extent as if each 
individual publication, patent or patent application was specifically and individually 
indicated to be incorporated herein by reference. 

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than 
routine experimentation, many equivalents to the specific embodiments of the invention 
described herein. Such equivalents are intended to be encompassed by the following 
Claims. 

What is claimed is: 



1. An isolated nucleic acid molecule selected from the group consisting of: 

a) a nucleic acid molecule comprising a nucleotide sequence which is at least 
55% identical to the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 
18, 19, 21, 22, 24, 25, 27, 28, 30, the cDNA insert of the plasmid deposited with the 
ATCC® as Accession Number 207178, the cDNA insert of the plasmid deposited with the 
ATCC® as Accession Number PTA-249, the cDNA insert of the plasmid deposited with the 
ATCC® as Accession Number PTA-250, or a complement thereof; 

b) a nucleic acid molecule comprising a fragment of at least 300 nucleotides of 
the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 
24, 25, 27, 28, 30, the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number 207178, the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-249, the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-250, or a complement thereof; 

c) a nucleic acid molecule which encodes a polypeptide comprising the amino 
acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, 29, the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number 207178, the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-249, or the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-250; 

d) a nucleic acid molecule which encodes a fragment of a polypeptide 
comprising the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, 29, the 
amino acid sequence encoded by the cDNA insert of the plasmid deposited with the ATCC® 
as Accession Number 207178, the amino acid sequence encoded by the cDNA insert of the 
plasmid deposited with the ATCC® as Accession Number PTA-249, or the amino acid 
sequence encoded by the cDNA insert of the plasmid deposited with the ATCC® as 
Accession Number PTA-250, wherein the fragment comprises at least 15 contiguous amino 
acids of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, 29, the amino acid sequence encoded 
by the cDNA insert of the plasmid deposited with the ATCC® as Accession Number 
207178, the amino acid sequence encoded by the cDNA insert of the plasmid deposited 
with the ATCC® as Accession Number PTA-249, or the amino acid sequence encoded by 
the cDNA insert of the plasmid deposited with the ATCC® as Accession Number PTA-250; 
and 

e) a nucleic acid molecule which encodes a naturally occurring allelic variant of 
a polypeptide comprising the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 
23, 26, 29, the amino acid sequence encoded by the cDNA insert of the plasmid deposited 
with the ATCC® as Accession Number 207178, the amino acid sequence encoded by the 
cDNA insert of the plasmid deposited with the ATCC® as Accession Number PTA-249, or 
the amino acid sequence encoded by the cDNA insert of the plasmid deposited with the 
ATCC® as Accession Number PTA-250, wherein the nucleic acid molecule hybridizes to a 
nucleic acid molecule comprising SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 
21, 22, 24, 25, 27, 28, 30, or a complement thereof, under stringent conditions. 

2. The isolated nucleic acid molecule of Claim 1, which is selected from the 
group consisting of: 

a) a nucleic acid comprising the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 
7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30, the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number 207178, the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-249, the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-250, or a complement thereof; and 

b) a nucleic acid molecule which encodes a polypeptide comprising the amino 
acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, 29, the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number 207178, the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-249, or the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-250. 

3. The nucleic acid molecule of Claim 1 further comprising vector nucleic acid 
sequences. 

4. The nucleic acid molecule of Claim 1 further comprising nucleic acid 
sequences encoding a heterologous polypeptide. 

5. A host cell which contains the nucleic acid molecule of Claim 1. 

6. The host cell of Claim 5 which is a mammalian host cell. 

7. A non-human mammalian host cell containing the nucleic acid molecule of 
Claim 1. 

8. An isolated polypeptide selected from the group consisting of: 

a) a fragment of a polypeptide comprising the amino acid sequence of SEQ ID 
NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the fragment comprises at least 15 
contiguous amino acids of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29; 

b) a naturally occurring allelic variant of a polypeptide comprising the amino 
acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number 207178, the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-249, or the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-250, wherein the polypeptide is encoded by a nucleic acid molecule which 
hybridizes to a nucleic acid molecule comprising SEQ ID NOs: 1, 3, 4, 6, 7, 9, 10, 12, 13, 
15, 16, 18, or a complement thereof under stringent conditions; and 

c) a polypeptide which is encoded by a nucleic acid molecule comprising a 
nucleotide sequence which is at least 55% identical to a nucleic acid comprising the 
nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 
25, 27, 28, 30, or a complement thereof. 

9. The isolated polypeptide of Claim 8 comprising the amino acid sequence of 
SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29. 

10. The polypeptide of Claim 8 further comprising heterologous amino acid 
sequences. 

11. An antibody which selectively binds to a polypeptide of Claim 8. 

12. A method for producing a polypeptide selected from the group consisting of: 

a) a polypeptide comprising the amino acid sequence of SEQ ID NOs:2, 5, 8, 
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA insert of the 
plasmid deposited with the ATCC® as Accession Number 207178, the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-249, or the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-250; 

b) a polypeptide comprising a fragment of the amino acid sequence of SEQ ID 
NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA 
insert of the plasmid deposited with the ATCC® as Accession Number 207178, the amino 
acid sequence encoded by the cDNA insert of the plasmid deposited with the ATCC® as 
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA insert of 
the plasmid deposited with the ATCC® as Accession Number PTA-250, wherein the 
fragment comprises at least 15 contiguous amino acids of SEQ ID NOs:2, 5, 8, 11, 14, 17, 
20, 23, 26, or 29, the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number 207178, the amino acid sequence encoded 
by the cDNA insert of the plasmid deposited with the ATCC® as Accession Number 
PTA-249, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited 
with the ATCC® as Accession Number PTA-250; and 

c) a naturally occurring allelic variant of a polypeptide comprising the amino 
acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number 207178, the amino acid sequence encoded by the cDNA insert of the plasmid 
deposited with the ATCC® as Accession Number PTA-249, or the amino acid sequence 
encoded by the cDNA insert of the plasmid deposited with the ATCC® as Accession 
Number PTA-250, wherein the polypeptide is encoded by a nucleic acid molecule which 
hybridizes to a nucleic acid molecule comprising SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 
15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30, or a complement thereof under stringent 
conditions; 

comprising culturing the host cell of Claim 5 under conditions in which the nucleic 
acid molecule is expressed. 

13. A method for detecting the presence of a polypeptide of Claim 8 in a sample, 
comprising: 

a) contacting the sample with a compound which selectively binds to a 
polypeptide of Claim 8; and 

b) determining whether the compound binds to the polypeptide in the sample. 

14. The method of Claim 13, wherein the compound which binds to the 
polypeptide is an antibody. 

15. A kit comprising a compound which selectively binds to a polypeptide of 
Claim 8 and instructions for use. 






16. A method for detecting the presence of a nucleic acid molecule of Claim 1 in 
a sample, comprising the steps of: 

a) contacting the sample with a nucleic acid probe or primer which selectively 
hybridizes to the nucleic acid molecule; and 

b) determining whether the nucleic acid probe or primer binds to a nucleic acid 
molecule in the sample. 

17. The method of Claim 16, wherein the sample comprises mRNA molecules 
and is contacted with a nucleic acid probe. 
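
The probe-based detection recited in Claims 16 and 17 can be sketched in simplified form. The sketch treats perfect base-pair complementarity as a stand-in for "selectively hybridizes". That is an assumption made only for illustration: real hybridization depends on stringency conditions that are not modeled here. The function names are hypothetical.

```python
# Complement map covering DNA and RNA bases; U is treated as T after pairing.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleic acid sequence as DNA."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(target: str, probe: str) -> list[int]:
    """0-based positions in `target` where `probe` would pair perfectly.

    Assumption: a binding site is an exact occurrence of the probe's
    reverse complement; mismatches and stringency are not modeled.
    """
    site = reverse_complement(probe).upper()
    target = target.upper().replace("U", "T")
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions
```

A sample is scored as positive in this toy model when `probe_sites(sample, probe)` returns a non-empty list.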

18. A kit comprising a compound which selectively hybridizes to a nucleic acid 
molecule of Claim 1 and instructions for use. 

19. A method for identifying a compound which binds to a polypeptide of Claim 
8 comprising the steps of: 

a) contacting a polypeptide, or a cell expressing a polypeptide, of Claim 8 with 
a test compound; and 

b) determining whether the polypeptide binds to the test compound. 

20. The method of Claim 19, wherein the binding of the test compound to the 
polypeptide is detected by a method selected from the group consisting of: 

a) detection of binding by direct detection of test compound/polypeptide 
binding; 

b) detection of binding using a competition binding assay; 

c) detection of binding using an assay for INTERCEPT 340-, MANGO 003-, 
MANGO 347-, TANGO 272-, TANGO 295-, TANGO 354-, or TANGO 378-mediated 
signal transduction. 

21. A method for modulating the activity of a polypeptide of Claim 8 
comprising contacting a polypeptide or a cell expressing a polypeptide of Claim 8 with a 
compound which binds to the polypeptide in a sufficient concentration to modulate the 
activity of the polypeptide. 

22. A method for identifying a compound which modulates the activity of a 
polypeptide of Claim 8, comprising: 

a) contacting a polypeptide of Claim 8 with a test compound; and 

b) determining the effect of the test compound on the activity of the 
polypeptide to thereby identify a compound which modulates the activity of the 
polypeptide. 

GTCGACCCACGCGTCCGTTATGTAACTATACATTTTCCCAGAAAT^ 79 



CCTTTTCCC AAGC AGTOTATTATGAAAATTTTCAAAC^^ 158 

C ATTACCTAAATTTTACC ATTAACATTTTACCCTGCTGGC ATTATTC 237 

CATTGGTGTATTTCTAAGTAAATTGTAGGCCTCAGTACACTTCC^ 316 

TCCATTTTTAAAAGAGCAATTCTTGATAGATTTATATAGITT^ 395 

TTGTTTCTTAriXSTATGTCTAGGGTCCTGAAGGGGAT^ 474 

CACAGAGGAAACACTGGTCCCCTTGGCAGAQAAGGTATAATAGGCCCAACAGGTAGAACT^ 553 

GCTOTAGAOGTGAiACTGGTC 632 

AAAGCAAAIH^GATATCAATGCTGCTATTCAAGCCT^ATTGAATC 711 

GTTTTAirrTATATTGGCACTGTCTC 790 

TCAAAGATTGTATTTAAAACAGATTGAAAAIXSTGAAACCATTCT^^ 86 9 

AGAAATATATGCGTAGGATOTTTTGTAAGGAAAACATra^ 948 

ACTATCau^GAAAGTAGTTAAATGAGGTTAGCCATGTTTCTTAAAAT^ 1027 

AAACTCTAATGATTCAATGTGTAATTTAAAAAACATAATAO^GTAGACATAG^ 1106 

ACrrGCAAATGTGAArrTAACCTCTTTAAAAGACT 11 6 5 

M E T H S S P A L A 10 
TGTTCTTTACATTCTACTCACfUVCTTACTACACATA ATG GAA ACA CAT TCT TCT CCT GCC TTG GCC 1251 

HVQPQ DFFV YIILM MTWQS Y 30 
CAT GTT GGT CCT CAG GAT TTT TTT GTT TAT ATA ATT CTT ATG ATG ACT TGG CAG AGC TAG 1311 

QNTEVTLID. HSE E IPKTLNY 50 
CAG AAT ACT GAA GTG ACT TTA ATT GAC CAC AGT GAA GAG ATA TTC AAA ACC CTG AAC TAC 1371 

LSNLLHStKN .PLGT RDNPA R 70 
CTT AGC AAT TTA TTG CAC AGC ATC AAG AAT CCT CTT.GGC ACA CGA GAT AAC CCA GCA CGA 1431 

I C K D . . L L N C E Q K V S D G K Y W I D 90 
ATC TGC AAA GAT TTA CTT AAC TGT GAA CAA AAA GTA TCA GAT GGA AAA TAC TGG ATT GAC 1491 

P N L G C P S D A I E V P C N P S A G G 110' 
CCA AAT CTT GGC TGT CCT TCA GAT GCC ATT GAG GTT TTC TGC AAT TTC AGT GCT GGT GGC 1551 

QT CLPPVSVT K LEFGVG KVQ 130 
-CTSr ACA-TK3C TTA CCT-^2CT- GTT TCT GTA ACA AAG TTG GAG TTT GGA GTT GGG AAA GTC CAG 1611 

MN PLHL LSS EATHllT I HCli 150 
-ATG-AAC TTC. CTT CAT TTA C TG AGT TCG GAA GCC ACC CAT ATC ATC A CC ATT CA C TGT CTA 1671 



Figure 1A 
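
The sequence figures print each row of codons beneath its single-letter amino acid translation. A minimal sketch of that codon-to-residue mapping follows, assuming the standard genetic code. The helper name `translate` and the use of `X` for unreadable codons are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA reading frame into one-letter amino acid codes.

    Whitespace is ignored, a trailing partial codon is dropped, '*' marks
    a stop codon, and 'X' marks any codon containing an unreadable base.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First ten codons of the open reading frame shown in the figure.
print(translate("ATG GAA ACA CAT TCT TCT CCT GCC TTG GCC"))  # METHSSPALA
```

A check of this kind can help flag rows where the scanned codons and the printed residues disagree.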




N T P R W T S T Q T S G P G L P I G F K 170 

AAC ACC CCA AGG TGG ACA AGO ACA CAA ACA AGT GGC CCA GGA TTG CCT ATT GGT TTC AAG 1731 

GWNGQIPKVNTLLEPKV LSD190 

GGA TGG AAT GGC CAG ATT TTT AAA GTA AAC ACT CTA GTT 6AA CCT AAA GTG CTT TCA GAT 1791 

D C K I Q D G S W H K A T P L F H T Q E 210 

GAC TGC AAG ATT CAA GAT GGC AGC TGG CAT AAG GCA ACA TTT CTT TTT CAC ACC CAG GAA 1851 

.-.p Q L- P -V • I £ V -Q— K LPH-LKTE RK 230 

CCT AAT CAA CTT. CCA GTG ATT GAA GTA CAA AAA CTT. CCT CAT CTC- AAA .ACT.JSAA. CGA AAG 1911,. 

YYIDSS SV C FL* 242 

ACG GGC CCT CAC TGT GCT AGT CTT TGT CCT GCT GAC ACC TAC GGT GTC AAC TGT TCT GCA 1615 

RCSCEMAXACSPXD GfiC VCK 482 

CGC TGC TCA TGT GAA AAT GCC ATC GCC TGC TCA CCC ATC GAC GGC GAG TGC GTC TGC AAG 1675 

E G W Q R G N "c S V P C PPG T W G F S 502 

GAA GGT TGG CAG CGT GGT AAC TGC TCT GTG CCC TGC CCA CCC GGA ACC TGG GGC TTC AGT 1735 

CNA SCQCAHEAVCSPQTGAC 522 

TGC AAT GCC AGC TGC CAG TGT GCC CAT GAG GCA GTC TGC AGC CCC CAA ACT GGA GCC TGT 1795 

TC TPGW HGAHCQLPCPKGQP 542 

ACC TGC ACC CCT GGG TGG CAT GGG GCC CAC TGC CAG CTG CCC TGT CCG AAG GGG CAG TTT 1855 

GEOCA SRCDCDH SDGCDPVH 562 

GGA GAA GGT TGT GCC AGT CGC TQT GAC TGT GAC CAC TCT GAT GGC TGT GAC CCT GTT CAT 1915 

GRCQCQAGW MGARCHLSC PE 582 

GGA CGC TGT CAG TGC CAG GCT GGC TGG ATG GGT GCC CGC TGC CAC CTG TCC TGC CCT GAG 1975 

Q L W G V N C S. N T C T C K N G G T C L 602 

GGC TTA TGG-GGA-QTC AAC TGT AGC AAC ACC TGC ACC TGC AAG AAT GGG GGC ACC TGT CTC 2035 

P E N G N C V C A P G P R G P S C Q R S 622 

CCT GAG AAT GGC AAC TGC GTG TGT GCA CCC GGA TTC CGG GGC CCC TCC TGC CAG AGA TCC 2095 

CQPG RYG KRCV PCKC AM HSP 642 

TGT CAG CCT GGC CGC TAT GGC AAA CGC TGT GTG CCC TGC AAG TGC GCT AAC CAC TCC TTC 21S5 

CHPSNGTC YCL A QW TGPOCS 662 

TGC CAC CCC TCG AAC GGG ACC TGC TAC TOO CTG GCT GGC TGG ACA GGC CCC G»C TGC TCC 2215 

QPCPPGHWGENCA Q T C Q C H H 682 

CAG CCA TGC CCT CCA GGA CAC TGG GGA GAA AAC TGT GCC CAG ACC TGC CAA TGT CAC CAT 2275 

0 G T C H P 0 DGSCICPLGW T G H 702 
GGT GGG ACC TGC CAT CCC CAG GAT GGG AGC TGT ATC TGC CCC CTA GGC TGG ACT GGA CAC 2335 

HCLEGCP LGTF GANCSQ PCQ 722 

CAC TGC TTA GAA GGC TGC CCT CTG GGG ACA TTT GGT GCT AAC TGC TCC CAG CCA TGC CAG 2395 

CG PGEKCHPETG ACVCPPGH 742 

TGT GGT CCT GGA GAA AAG TGC CAC CCA GAG ACT GGG GCC TGT GTA TGT CCC CCA GGG CAC 24S5 

SGAPCRIGIQEPPT V M P T T P 762 

AGT GGT GCA CCT TGC AGG ATT GGA ATC CAG GAG CCC TTT ACT GTG ATG CCG ACC ACT CCA 2515 

VA YNSLGA VIGIAVL GSLVV 782 

GTA GCG TAT AAC TCG CTG GGT GCA GTG ATT GGC ATT GCA GTG CTG GGG TCC CTT GTG GTA 2575 

A L V A L F IGYRHWQK GKEHH H 802 

GCC CTG GTG GCA CTG TTC ATT GGC TAT CGG CAC TGG CAA AAA GGC AAG GAG CAC CAC CAC 2635 

li A V A Y S S G R L D.G S^E Y V M P D V 822 

CTG GCT GTG GCT TAC AGC AGC GGG CGC CTG GAC GGC TCC GAG TAT GTC ATG CCA GAT OTC 2695 

P J? S Y S H Y Y S N P S Y K T L S Q C S 842 

CCT CCG AGC TAC AGT CAC. TAC TAC TCC AAC CCC AGC TAC CAC ACC CTG TCG CAG TGC TCC 2755 

PN P P P P N K V P G PL P A S IiQN P 862 

CCA AAO GGG GCA CCC CCT AAC AAG GTT CCA- GGC. CCG CTC. TTT GCC AGC CTG CAG AAC CCT 2815 

E RPGGAQGH DNHT T LPADWK 882 

GAG CGG CCA GGT GGG GCC CAA GGG CAT GAT AAC CAC ACC ACC CTG CCT GCT GAC TGG AAG 2875 

H R RB P P P G P li D RG S S R L D R S 902 

CAC CGC CGG GAG CCC CCT CCA GGG CCT CTG GAC AGG GGG AGC AGC CGC CTG GAiC CGA AGC 2935 

YSYS.Y SNG P GP PYDK GLISE 922 

TAC AGC TAT AGC TAC AGC AAT GGC CCA GGC CCA TTC TAC GAT AAA GGG CTC ATC TOT GAA 2995 

EELG ASVASXiSSENPYATiR 942 

GAG GAG CTC GGG GCC AGT GTG GCT TCC CTG AGC AGT GAG AAC CCA TAT GCC ACC ATC CGG 3055 

O L P S L P G G P R E S S Y M E M K G P 962 

GAC CTG CCC AGC TTG CCA GGG GGC CCC CGG GAG AGC AGC TAC ATG GAG ATG AAA GGC CCT 3115 

P S G S — A PRQPPOFW DSQRRRQ 982 

CCC TCA GGA TCT GCC CCC AGG CAG CCT CCT CAG TTT TGG GAC AGC CAG AGG CGG CGG CAA 3175 

P Q P QRD S G T Y E Q P S P L I H D R 1002 

CCC CAG CCA CAG AGA GAC AGT GGC ACC TAC GAG CAG CCC AGC CCC CTG ATC .CAT QAC CGA 3235 

DSV GSQPPLPP GLPPQH YDS 1022 

GAC TCT GTG GGC TCC CAG CCC CCT CTG CCT CCG GGC CTA CGC CCC GGC CAC TAT GAC TCA 3295 

PK KSH I P GHYDL P PVRHP PS 1042 

CCC-AAG AAC AGC CAC ATC CCT GGA CAT TAT GAC TTG CCT CCA GTA-C6G-GA5P-GGC CCA TCA 3355 

P PL RRQDR* 1051 

CCT CCA CTT CGA CGC CAG GAC CGT TGA 3382 

GGAGCCAGGATGGTATGGCAGAGGCCAGCACACCTGGCTGTTGCTGCTCAAGGCTGGGGACAGAGCCTAG^ 3461 



Figure 13C 




GCCAGGAGCAGGGAGTC«3ACCGGCAGGCTGTGAACATGAACAACGCTTAACAGAGCAAGTGATGGGA0CCT^ 3540 

GGTTCTACCATGGGAGACGCTGA1K:AGCAGGATGCCTGGCTCCCTTTCCC AAC 3619 

CTGTGTACATAAACTGGTGGGTTGGAAGTTGCTGGGTAACTCTGATT^^ 3698 

ATGCTCAGCCTGGGCTCTGTGCGTGTGTGTGTTTCTGT^ 3777 

TACCATTTAGTAGGGAGATGGAACCAACCCAATTAACTCTAGCAATAGCCTCOT 3 8 S 6 

GAACCrrcCAATGCATGGCTCATAATTTCAAAATACAGG^ 3935 

TCTTTGCTCTTCTGCCy^GTATCAAAACT^^ 4014 

TCACCTTGAACTGTGTTCCTGTCACTGCACGCCAGTCA^^ 4093 

GCACAGGGACCTGCACAa:TGGAGTGCCCTTCCTCCCCCACTCGCCT^^ 4172 

TCAGGGAAGOXKICCACCCTCCGTACATCTTTCAC^ 4251 

CCTACAGGG1X3CCAGGCACTTCTTTAATGGGTTCTTTC 4330 

CHOTAAGCTCCCTGAAGGCAAGAATCCTGT^ 4409 

AATGCTCCXrCTUUU^GGCTGAGTCGC 4488 

TGTATGCTCTGACAGTTAa^GACTGWVTAAGT^^ 4567 

GCTCAGGTOTGGGAAGGaXSCCAGGGGCAGa^ 4646 
AACAGGAGAGAGTATAC^GGCATGCCrTGATTTAT^^ 4725 
GGGACATATATGTGA^GCAATAGGTTXAGAAAAGCAAAGCAGAGAAAO^^ 4804 
AATCTGTTOQCACaVTTTTTCCAATAGCA 4883 
TTCAAGCATTTTCATTGTTATTATAlXyr^^ 4962 
GGGGCGCCATQAACCGCACCCATATAACACGGTAAACTTAAT^ 5036 

>Capnn . . pCsngGtCvntpggssdnf ggytCeCppGdyylsytGkrC 
C p+-f + C + G+Cv +C+C pG + G++C 
CVPLCaqECVH-GRCVAPN QCQCVPG WRGDDC 181 



*->CapnnpCsngGtCvntpgg8ednfflgytCeCppGdyylsyt0krC<» 
+ C++ + c + g c+Cp tO+ C 

200 CQFRCQCHG-APCDPQTG ACFCPAfi RTCPSC 229 



*->CapnnpCfingGtCviitpggssdnfggytCeCppGdyylfiyl:GkrC<- 
C+++ pC+nga+ + g +C CppG + G C 
242 CPSTHPCQMGGVFCrrPQG-^ SCSCPPG ^WlfeTIC 272 

* 



*->CapnnpCsngGtCvntpggfisdnfggytCeCppGdyyleytGkrC<- 
C++++ C+ngG.C g +C+C+pG ytG+rC 
285 CSQBClU:HNGGLCDRFTO'---.-'----QC:mavPG '-YTGDRC 315 



• ->CapnnpCsngGtC^mtpggssdnf ggytCeCppGdyylflytG3crC<- 
Ca+++ C +++C +0 C C +G +tG+rC 
328. CAETCDCAPDARCPPANG ACLCEHG FTODRC 358 



* ->CapnnpCsngGtCvntpggssdiif ggytCeCppGdyylsytGkrC<- 

fit^.± 1 . ., C^-fr — g ..+C C pG +-^G +C 

378 CDRE HSLSCHPMNG- ECSCLPG -MAGLHC 404 

* >CapnnpCsngGtCvntpggssdnf ggytCeCppGdyylsy tGkrC< - 
* * C++++ C+'fgG+C+ t g C^-C+pG ytG++C 
417 CQEHCLCLHGQVCQATSG LCQCAPG YTGPHC 447 



* ->CapnnpCsngGtCvntpggfisdnf ggytCeCppGdyylsytGkrC<- 
C+ + C n C + g +C+C+'t-G ++ i-C 
460 CSARCSCEKAIACSPliXS ECVCKEG WQRONC 490 



* - >CapimpCsngG t Cvn tpggs sdnf ggytCeCppGdyyl ey tGkrC< - 
C+ + C -f ++C + g C+C+pG ++G +C 
503 CNASCQCAHEAVCSPQTG ACTCTPG WHGAHC 533 



*->WstdJchiggrtslGfnleyrirvtCdenyYGegCnkJt:rPrdDafgH 
+t + + ^+ ++ C + ^K;egC+ C+ H 
518 --QTGACTCTPG WHGAHCQLPCPKGQPGEGCASRCDCD H 554 

yt . Cd * enGnklCleGWkG€yC<** 
4- +Cd+ +G+ +C -HSW-KS C 
555 SDgCOpVHGRCQCQAGWMGARC 576 



*->CapimpCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC<- 
Ca+ + C++ C +++g +C+C+ G + G rC 
546 CASRCDCDHSDGCDPVHG RCQCOAG ^WMGARC 576 



.*^:iCapnhpCfingGtCvntpggssdnfggytCeCt>pGdyylfiyt^^ 
C+ ♦+ C+rigGtC++ a C4C4'pG + G+ C 
589 CSNTQTCKMGGTCLPENQ NCVCAPG-— — PRGPSC 619 



♦ ->CapnnpCsngGtCvntpggssdnf ggytCeCppGdyylsytGkrC<- 
C p C n+ +C+++ g tC C G +t:G++C 
632 - CVPC-KCANHSFCHPSNG TCYCIAG WCOPDG ^61- 

♦->CapimpCengGtCvntpggssdn£ggytCeCppGdyylsytGkrC<- 
Ca+++ C++gGtC++ g -KI+Cp G +t6++C 
674 CAQTCQCKHGGTCHPQDG SCICPLG-- WTGHHC 704 



*->CapnnpC6ngGt(:^mtpggssdn£ggytCeCp^pGdyylsytGkrC<- 
0++++ C g g C+CppG ♦ 4<5 C 
717 CSQrCQCGPGEKCHPETG ACVCPPG HSGAPC 747 

ST HASGDP VHGQCRCQAGW 



19 



G TCG ACC CAC GCG TCC GGT GAC CCT GTT CAT GGA CAG TGC CGA TGT CAG GCT GGT TGG 58 

MGTRC MLPC PEGFWGANCS M 39 

ATG GGC ACA CGC TGC CAC CTQ CCT TGC CCG GAG GGC TTT TGG GGA GCC AAC TGC ACT AAC 118 

T C T C K N G G T C V S E N G N C V C A 59 

ACC TGT ACC TGC AAG AAT GGT GGT ACC TGT GTG TCT GAG AAT GGC AAC TGC GTG TGC GCA 178 

P G F R G P S C Q R P C P P G R Y G K R 79 

CCA GGG TTC CGA GGC CCC TCC TGC CAG AGG CCC TGC CCG CCT GGT CGC TAT GGC AAA CGC 238 

CVQCKCNNN H SSCHPSD GTC 99 

TGT GTG CAA TGC AAG TGT AAC AAC AAC CAT TCT TCC TGC CAC CCA TCG GAC GGG ACC TGC 298 

S C L A G W T G P D C S E A C P P G H W 119 

TCC TGC CTG GCG GGC TGG ACA GGC CCT GAC TGC TCC GAG GCA TGT CCC CCA GGC CAC TGG 358 

GLKC SQLC QCHHGGT CHPOD 139 

GGA CTC AAA TGC TCC CAA CTC TGC CAG TGT CAT CAT GGT GGG ACC TGC CAC CCC CAG GAT 418 

GSCICTPG WTG PN CLBGCPP 159 

GGG AGC TGT ATC TGC ACG CCA-GGC TGG ACT GGA CCC- AAC- TGC TTG GAA GGC TGC CCA CCA 478 

R M F G V N C S Q L C Q C D L G E M C* H 179 

AGA ATG TTT GGT GTC AAC TGC TCC CAG CTA TGT CAG TGT GAT CTC GGA GAG ATG TGC CAC 538 

PET GACVCPPGH SGADC KMG 199 

CCA GAG ACT GGG GCT TGT GTC TGT CCC CCA GGA CAC AGT GGT GCA GAC TGC AAA ATG GGA S98 

SQES F TI MPT SPVTHNSLGA 219 

AGC CAG GAG TCC TTC ACC ATA ATG CCC ACC TCT CCC GTG ACC CAT AAC TCA CTG GGT GCA 658 

VIGIAVLG TLVVA LIALPIG 239 

GTG ATT GGC ATT GCA GTA CTG GGA ACC CTC GTG GTG GCC CTG ATA GCA CTG TTC ATT GGC 718 

YRQWQ K G K EH E HIjAVAYS T G 259 

TAC CGC CAG TGG CAA" AAG GGC AAG GAA CAT GAG CAC TTG GCA GTG GCT TAC AGC ACT GGG 778 

R L D G '-S- D Y V M P D V S P S Y S H Y Y 279 

CGG CTG GAT GGC TCT GAT TAC GTC ATG CCA GAT GTC TCT CCG AGC TAT AGT CAC TAC TAC 838 

S N P S Y H T li S 0 C S P N P P P P N K 299 

TCC AAC CCC AGC TAC CAC ACA CTG TCT CAG TGT TCT CCT AAC CCC CCG CCC CCT AAC AAG 898 

V PGS QL F VSS Q AP E RP' SR A H 319 

ore CCA GGC AGT CAG CTC TTT GTC AGC TCT CAG GCC CCT GAG CGG CCA AGC AGA GCC CAC 958 

G R E M H T T L P A D WKHRR EPHD 339 

t3GG-CGT GAG AAC CAT ACC ACA CTG CCC GCT GAC TGG AAG CAC CGC-CGG-GAG-^eC CAT GAC 1018 

R GAS HLD RSY S CS Y S HRMG P 359 

AGA GGC GCC AGC CAC CTG GAC CGA AGC" TAT AGC TGT AGC TAT AGC CAC AGG AAT GGC CCA 1078 

AGAGCCGGCATGGTATGGGAGCGTGCCTATGTACCTTGCCAGGAGCAGGGACT^ 1574 

CTIHSGTGAAGTGAACAGAGACGGACTGTGGCCCTGTGC^ 1653 

CTTTTCCAACCCACTGCTCAAGTCCCTGTGGACATAAGCTGGTGGGa 1732 

GATTTTTTTTTAJUVGTATGTGTTGGGTACCTTTTCTGTGT^^ 1811 

AGAGGGAGTCAGGTATAGGTTCTGCCTTCTGCACTTTCC^ 1890 

GCTCCACCAGCAGCAGGCCCTAACTACCTGCCTGCCCTTCACCCAGTAAl^^ 1969 

CCCGACTCTGGTGTTGTCCTCCTGGTACGCCTTGACGGTCCTGCAGTC^ 2048 

GAATGAAGGCTGTCTGCCACCCTACTTCCCAGCCCAGGAATTGOCACATCTAAGTT^ 2127 

ACAGAAGGCAGAAGTGGTACCAGGCAAGAAGATGGGATTGTTGCATTT T GTT^ 2285 

TAGTCCiXKSCTGGCCTGGAACTCAAGAGCTCTGCCTGCC^ 2364 

TGCACAOCTCAAGCTGCACrrCCGATGTGCTTTC^ 2443 

TCGCCAGCTCTCTGATGCAGGACTCTGGTGTTTAGGCTCACTCACTATTGGW 2522 

AAATGTTCCTCTAAAAGCTGAAAAAAAAAAAAAAAAAGGGCGGCCGC 2569 

GTCGACCCACGCGTCCGGCTCCCAGCCCACGCCCAAAC AGACACAGCGTAGCCCGG<X:CAGCTCTTAAGGA^ 7 9 

GTGAGAAGAGGCCCTCAGAGATCTGACAGCCTAGGAGTGCGTGGACACCACCT^ 158 

M A P A R 5 

CGAAGACCAAGCGCAAAGCGAGCCCTGCCCTCCATCCTGACTGCTCCTC^ GCA CCG GCC AGA 231 

AGFC Ptf . LLLLLLGliWVAEIP 25 

GCA GGA TTC TGC CCC CTT CTG CTG CTT CTG CTG CTG GGG CTO TGG GTG GCA GAG ATC CCA 291 

VS A.KPK GMTSSQWFKIQHMQ 45 

GTC AGT GCC AAG CCC AAG GGC ATG ACC TCA TCA CAG TGG TTT AAA ATT CAG CAC ATG CAG 351 

PSPQ ACNSAMKNINKHTKR C 65 

CCC AGC CCT CAA GCA TGC AAC TCA GCC ATG AAA AAC ATT AAC AAG CAC ACA AAA CGG TGC 411 

KDLN TFLH EPFSSVAATCQT 85 

AAA GAC CTC AAC ACC TTC CTG CAC GAG CCT TTC TCC AGT GTG GCC GCC ACC TGC CAG ACC 471 

PKIAC'KNG DKNC HQSHG PVS 105 

CCC AAA ATA GCC TGC AAG AAT GGC GAT AAA AAC TGC CAC CAG AGC CAC GGG CCC GTG TCC 531 

tiTMCK IiTSG KYPN CRYKE K R 125 

CTG ACC ATG TGT AAG CTC ACC TCA GGG AAG TAT CCG AAC TOd AGG TAC AAA GAG AAG CGA 591 

Q N" K S V V V A C K P P Q.K K D S Q Q F 145 

. CAG AAC AAG TCT TAC GTA GTG GCC TGT AAG CCT CCC CAG AAA AAG GAC TCT CAG CAA TTC 651 



H L V P V H L D .R V L * 
CAC CTG GTT CCT GTA CAC TTG GAC AGA GTC ClD© TAG 



157 
687 

766 



TTCTCTTCCCCTCATCTCTTGGGGCTGTTCCTGGTTCAGCCTCT^ 845 

GCTGTAGAGGGATGQCmTCATCTTT^^ 924 

GTGGGTTCCCTGGICTATGC^ 3-003 

AATGQAAAGGGGGCATATGGGArr^^ 1082 

GCAGTGAGGTGACCTGAAGGWA6AAAAATATAAATAAA ^^^^ 

ACATA<SACTTOACAGGGATTGTAT0CCTTC1^ ^240. 

CTTQTCTAATTAGTTAGTAGCAGAI^CTGGACTTGA^ 3.3 19 

CCAGTTGCGCAAGAAAGAAGTCACTGTTAb^GGCAAGCGGTGAACl^^ 3-398 

CTCTQAAGAGCCAGTTACCCTGTGTTGGCTGCAATAAAGGTCATTACCTCT^^ 3.477 



AAAAAAAAAAAAAAAAAAAA 



1497 

* ->qesrAqkFlrQHiDspkt s s snpnYCNqMMdkrRnmtqgrCKpvNTF 
+ ++ q+P++QH+ ++e + CN +M k+4-n rCK+ NTP 
32 GMTSSQWFKIQHH-<-QPSPQA CNSAK-KtaMKHTKROCDLKTF 71 

vHesladvncaVCcqkNvtCJcNGQkNCyqiskss 

+He++++V a C +♦ + CkNG kNC+QS+ C+lt+g yPnC 

72 LHEPFSSVAATCQTPKIACKNGDKNCHQSHGPVSIiTMCKLTSGK— YPNC 119 ■ 

rYrtsas tkhllVACEgrd . rddPyynPyvPVHFDaev< - ♦ 
rY+ + ++k ++VAC +++4+d+ ++ VPVH+D++ 
120 RYKEKRQNKSYWACRPPQkKDSQQFH-LVPVHLDRVL 156 

M P L L 4 

GTCGACCCACGGCGTCCGGCCAGGCTCCACTGAGGGGAACGGGGACCTGTCTGAAGAGAAG ATG CCC CTG CTG 73 



TLYLL LPWLSGYSIATQI T G 24 

ACA CTC TAG CTG CTC CTC TTC TGG CTC TCA GGC TAC TCC ATT GCC ACT CAA ATC ACC GGT 133 

PT TVN G LERQS LTVQCVY R S 44 

CCA ACA ACA GTG AAT GGC TTG GAG CGG GGC TCC TTG ACC GTG CAG TGT GTT TAC AGA TCA 193 

G WETYIiKWWCRGAIW RDCKI 64 

GGC TGG GAG ACC TAC TTG AAG TGG TGG TGT CGA GGA GCT ATT TGG CGT GAC TGC AAG ATC 253 

L V K T S 0 S B Q E V K R D R V S I K D 84 

CTT GTT AAA ACC AGT GGG TCA GAG CAG GAG GTG AAG AGO GAC CGG GTG TCC ATC AAG GAC 313 

N0K N RTPTVTME DLMKTDAD104 

AAT CAG AAA AAC CGC ACQ TTC ACT GTG ACC ATG GAG GAT CTC ATG AAA ACT GAT GCT GAC 373 

T Y WC GI EKT GND LG VTVQVT 124 

ACT TAC TGG TGT GGA ATT GAG AAA ACT GGA AAT GAC CTT GGG GTC ACA GTT CAA GTG ACC 433 

IDPA STPAPTT PT STTPT A P 144 

ATT GAC CCA GCG TCG ACT CCT GCC CCC ACC ACG CCT ACT. TCC ACT ACG TTT ACA GCA CCA 493 

VT QEETSSSPTLTG HHLDNR 164 

CTC ACC CAA GAA GAA ACT AGC AGC TCC CCA ACT CTG ACC GGC CAC CAC TTG GAC AAC AGG 553 

HKL LKL 'SVLLPL IPTILLL L 184 

CAC AAG CTC CTG AAG CTC AGT GTC CTC CTG CCC CTC ATC TTC ACC ATA TTG CTG CTG CTT 613 

L V A A S L L A W R M M K Y Q Q K A A G 204 

'l-rc GTG GCC GCC TCA CTC TTG GCT TGG AGG ATG ATG AAG TAC CAG CAG AAA GCA GCC GGG 673 

MSPEQVLQ PLE GD LCYADLT 224 

ATG TCC CCA GAG CAG GTA CTG CAG CCC CTG GXG GGC GAC CTC TGC TAT GCA GAC CTG ACC 733 

L Q li A G T -.-S P R K A X 35 K L S S .. A Q U 24A 

CTG CAG CTG GCC GGA ACC TCC CCG CGA AAG GCT ACC ACG AAG CTT TCC TCT GCC CAG GTT 793 

D Q V E V EYV. TMA S LP KEDI S Y 264 

CAC CAG GTG GAA GTG GAA TAT GTC ACC ATG GCT TCC TTG CCG AAG GAG GAC ATT TCC TAT 8S3 

AS LTLGAEDOE PTYCNMGH L 284 

GCA TCT CTG ACC TTG GGT GCT GAG GAT CAG GAA CCG ACC TAC TGC AAC ATG GGC CAC CTC 913 

SSHIiPGRGPEEPTEYSTlSR 304 

AGT AGC CAC CTC CCC GGC AGG GGC CCT GAG GAG CCC ACG GAA TAC AGC ACC ATC AGC AGG 973 

p ♦ 306 

-"CCT^TAGT 97 9 

CCTGCACTCCAGGCTCCTTCTTGGACCCCAGGCTCTGAGCAC ACTCCTGCCTCATC 10 58 

C ATCAGGACCAACCCGGGGACTGGTGCCTCTGCCTGATCAGCC AGCATTGCCCCTAGCTCTGGGTTGG^ 1 1' 3 7 



Figure 21A 




AGTCTCAGGGGGCTTCTAGGAGITGGGG'mnVTAAACGTCCCCTCCTCTCCTACATAGT^ 1216 

ATGCTCTCGGGCTTTCATGGG AA1*G A'lX; AAGATG ATAATGAG AAAAATC 1295 

ATAATACAATGAACCTTTATTTATTGCC1* ACC AC ATCrrATGGGCTGAATAAT^ 1374 

TCCTO^GAACTTGTGACTGTTACCTTCTGTGGC AG AAAGGGACAGTCCAGATGT^ 1453 

AGAGGTTATTCTTGCTGArrCAGGTGGGCCCAA7U\TATCACCACAAGGGT^ 1532 

GAGGTAGAGACAAAGTGATGATGGAAGTGGACGTGGG1X3TG ACGTGAGCAGGGGCCATGAATGCCGCAGC 1611 

CCyiGAAAGGGAAAGGAATGGATTCCCCTGCCTGGAGCCTCC AAAAGAAACCAGCCCTGCCCACGCCT^ 1690 

ATTCAAACTGATCTTGAGCTCCTGGCCTCC AGAAWGCAGG AGAATAAATT^ 1769 

AAAAAGSiSCCjGCCGCTAGA 1788 

♦ ->GesvtLtCsvsgf gppgvsvtwyf kngk . Igpsllgysysr 1 

33 R?3SLTVQCnmi-rSGWETXJ.KWWCrffai^ 75 

esgekanlsegrf sic BltLtissvekeDsGtYtCw<-* 

++ r+si ++++++++t+t+* ++ 'k D+ tY4C 

76 KRD-— -RVSIKdnQknrTIT^VlMEDIJaTOADT^ 110 



Figure 23 

MDHCGALPL 9 

CACGCGTCCGGCCAGTTCTTGGAGGAGACTCTGCACAGG6C ATG GAT CAC TGT GGT GCC CTT TTC CTG 68 

C LCLLTLQNAT. TETWEELLS 29 

TGC CTG TGC CTT CTG ACT TTG CAG AAT GCA ACA ACA GAG ACA TGG GAA GAA CTC CTG AGC 128 

Y MENM QVSR QRSSVPSSRQL 49 
TAC ATG GAG AAT ATO CAG-GTG TCC AGG GGC CGG "AGC TCA GTT TTT TCC TCT CGT CAA CTC 188 

KQIiEQMLLMTSFPGYNLTLQ 69 

CAC CAG CTG GAG CAO ATG CTA CTG AAC ACC AGC TTC CCA GGC TAC AAC CTG ACC TTG CAG 248 

™ p J- g - g- 2 A "F " K'""L S C D F S G " V S h 89 

ACA CCC ACC ATC CAG TCT CTG GCC TTC AAG CTG AGC TGT GAC TTC TCT GGC CTC TCG CTG 308 

T S A T L K R V P Q A 0 G Q H A R G 0 H 109 

ACC AGT GCC ACT CTG AAG CGQ GTG CCC CAG GCA GGA GGT CAG CAT GCC CGG GGT CAG CAC 368 

AM QF PAELTR DACK TRPREL 12S 

GCC ATG CAG TTC CCC GCC GAG CTG ACC CGG GAC GCC TGC AAG ACC CGC CCC AGG GAG CTG 428 

RtilClYFSHTHP FKOENNSS 149 

CGG CTC ATC TGT ATC TAC TTC TCC AAC ACC CAC TTT TTC AAG GAT GAA AAC AAC TCA TCT 488 

LIiMNYVLGAQtiS H.GHVN MLR 169 

CTG CTG AAT AAC TAC GTC CTG GGG GCC CAG CTG AGT CAT GGG CAC GTG AAC AAC CTC AGG 548 

DPVMISFWHMQSLEGVTLTC 189 

GAT CCT GTG AAC ATC AGC TTC TGG CAC AAC CAA AGC CTG GAA GGC TAC ACC CTG ACC TGT 608 

V F W K E G A R K Q P W G G W S P E G C 209 
GTC TTC TGG AAG GAG GGA GCC AGG AAA CAG CCC TGG GGG GGC TGG AGC CCT GAG GGC TGT 668 

RTEQPSHS O V LCR CKHLTY F 229 

CGT ACA GAG CAO CCC TCC CAC TCT CAG GTG CTC TGC CGC TGC AAC CAC CTC ACC TAC TTT 728 

A Vti MQl' S P ALV P AE L L A P L T 249 

OCT GTT CTC ATG CAA CTC TCC CCA GCC CTG GTC CCT GCA GAG TTG CTG GCA' CCT CTT ACG 788 

Yl SLVGCSlSlVA SIilTVLL 269 
TAC ATC TCC CTC .GTG GGC TGC AGC ATC TCC ATC GTG GCC TCG CTG ATC ACA GTC CTG CTG 848 

..H F- .H- F R- K Q S D. S L T R..-. I H.._ M .^.N_. . L H A 289 

CAC TTC CAT TTC- AGG AAG CAG AGT GAC TCC TTA ACA CGC ATC CAC ATG AAC CTG CAT GCC 908 

S V L L - li N I A F L L S P A F A M S P V 309 
TCC GTG CTG CTC CTG AAC ATC GCC TTC CTG CTG AGC CCC GCA TTC GCA ATG TCT CCT GTG 968 

p G S A C T A LAAALHYA LLSCti 329 

CCC GGG TCA GCA TGC ACG GCT CTG GCC GCT GCC CTG CAC TAC GCG CTG CTC AGC TGC CTC 1028 

T ^W" M A i B G P N L L li" xj— -r-~v N I 349 

ACC TGG ATG GCC ATC GAG GGC TTC AAC CTC TAC CTC CTC CTC GGG CGT GTC TAC AAC ATC 1088 

Y I R R V V F K L G V L G W G A P A h L 
TAC ATC CGC AGA TAT GTG TTC AAG CTT GGT GTG CTA GGC TGG GGG GCC CCA GCC CTC CTG 



369 
1148 



VLLSLSV KSSV 
GTG CTG CTT TCC CTC TCT GTC AAG AGC TCG GTA 

DSWENGTGFQN 
GAC AGC TGG GAG AAT GGC ACA GGC TTC CAG AAC 

VVHSVL VMGYG 
GTG GTG CAC AGT GTC GTG- GTC ATG GGC- TAC GGC 

VLAWALWTLRR 
GTG-C1K3LGCC-.TGG GCG,.QTG_TGG. ACQ CTO . Cj^CAGG. 

VRACHDTVTVL 
GTC AGG GCC TGC CAT GAC ACT GTC ACT GTG CTG 

W A L A F F S F G V F 
TGG GCC TTG GCC TTC TTT TCT TTT GGC GTC TTC 



YGPCTI PVF 389 

TAC GGA CCC TGC ACA ATC CCC GTC TTC 1208 

MSICWVRSP 409 

ATG TCC ATA TGC TGG GTG CGG AGC CCC 1268 

GL TSLFNLV 429 

GGC CTC^ACG TCC .CTC.TTC AAC CTG GTG 1328 

LRERADA P S 449 

CTG.. CGG. GAG CGG..GCG GAT GCA CCA AGT 1388 

GLTVLLGTT 469 

GGC CTC ACC GTG CTG CTG GGA ACC ACC 1448 

LliPOLFLFT 489 

CTG CTG CCC CAG CTG TTC CTC TTC ACC 1508 



I li N S li YG F F L F L WF C S Q R C R 509 

ATC TTA AAC TCG CTC TAC GGT TTC TTC 'CTT TTC CTG TGG TTC ' TGC" TCC CAG -CGG"- TGC " CGC IS'^T 

SEAEAKAQIEAFSSSQTTQ* 529 

TCA GAA GCA GAG GCC AAG GCA CAG ATA GAG GCC TTC AGC TCC TCC CAA ACA ACA CAg!- TAG 1628 

TCCGGGCCTCCTGGCCTGGAATCCTCAGCCTCTCTGGCCGCCAGTAGCCTGAGGCTAC^ • 1707 

CAGGCCTGCTGCTGGACCCCAGAGGCCACTGTGACCCKrCAAGGGqCCTTTTCCACTTCCACG^ 1786 

GGGGAAGGC ATTGCTCTACCTCrcCCTGACAlTO^ 1865 

CTGGTACCTGGGCCCAGCTCGCCAGGGATGTGGGCAGAGCACa\GCCTGGGCATCAGGAAGCCMGTl^^ 1944 

CTTTGAGTCTGTCTGTATGACCTTGGGCCTGCCACTTCTCACAGACCCTAGG^ 2 023 

CGGCTTTGTTTCAGCCTAACCCAGGAGCTTAGTAAAAATTGCATAAGACCAGGG^ 2102 

ATTCCCGCGGCCTCCACCTGCTTGCTAGGGGCAGGATCTCATTCAGGCT^ 2181 

CTTCCTCCAGGGGAGGGCCAGATGGCATCCTGGCTTGGGGCGGGTGGGACCTACCCA^ 2260 

ATGCCTGAGGCCTCTt5tCCTTTAACTCCCTAAA 2339 

iSGmcaSCCGTTCCCAGAGGCTCCTCCTGCGGTGCTC 2418 

AGTTTTCTTGGGGGCAGAGGAAAACGCTTCTTTCTCCTCCAGCTGAATCA 2497 

'GATTGGGCAAeS^^ 2576 

AATCACCCTTACCCCAGACCTTCATQAGACAGTGCTCATGAAGCCAGTGCGTTTCCC^ 2655 

"gTOGGTCCACACTCAGAGGCCCTTGG^ . 2734 

CAGCTGGAGGGGCCGTAACTGCAGGACTGCGCCTACTGAGTQACCCATTTCCTCCAGGAGGA;^ 2813 

CACGGCCATTTGTCTCTTTTCCCAATGCGGCGGTOrACT^ 2892 

GAGCAGCGTGCiCAGGTi^ 2971 

GCAGAAAATAGGAGCAGGATTTCCCCTGOOGAAAAGTTCTCCTG<^ 3050 

AAATAACTCCTTCACCAGGCAGTGAGTQGCGTA^ 3129 

TGCTAGCTGTGAGACTGTOGACAAACCACT»GCCTCTGTGT^ 3208 

^.GGTACCTAIOTTGAAGAC^T^ 3258 

♦ ->CnrtWDgi tC . .Wpdt PPGelVwpCPkyfygf ssdqtdttgn 

• +tC W+ + +++P+G ++ C + +q + + 

187 LTCvfWKEGarkqPWGC?WSPEGC RTEQ— -.PSH 2X6 

vsRnCtedGsWsepppsNrtWmysaCgeddpeeeeekkkkyylvlkiiY 

217 SOVLCRCNH— LTYFA-^--— VLMQLSPALVPAELLAPLTYIS 252 

tvGYSlSLaaUvAwII/llFIOdAtlwpdnadgalevgapWGAPfqvT 

+vG S+S++a 1+ V++ FRk + 
253 LVGCSISIVASLITVLLHFHFRKQS DSL— r 280 

SirCtRNyiHrriNLPlSFlLrAasvfilcdavlHsevssdeperLssrcsls 
tR IHWNL +S +L +++ 4.+ a 8 v+ ++ 
281 TR—IHMmiHASVIJJliNIAFLLSPAFAMSFVPGSA-- 313 

tgqvwgCkllvvfQfqycvmtNffWlLvEGlYIihtLIiVVtffsErkylw 
C +1 ++ 'f+y++++ +W+ +EG L+'LI* + +4-y + 
314 GTALAAA-LHYALLSCLTVmAIEGFNLYIJUL^RVY---^^ 352 

wYl lIGWC^^PlVf vtvWaivRllf edtgCWdsnGIJtfnFPEAKmCiW 

V+ + -f++GVfG+P++ V v+4- ++ +C++++ F 
3S3"4lWfklgVlX2WGAPAlJLrVI^SLSVKSSW-GPCTlP^ FDSWENGTG. 397 

msdnshlwWIlkgPiLlsilV. * MFf IPinllrlLvtKLRaa 

W+ + P++ s+lV + +•♦• ++ N++++ ++ L + liR+ 
398 F-QmSlCW-HSP^ 444 . 

qtgetdqrqY6qYrkl»aKSn.lLIPLfGthywFafrPsndarGvlrkik 

445 ADAPSVR ACHDlVTVIiOLTVl#tOTTMALAPPSP^ VFLLPQ 484 

ly f e 1 sLgSFQGFfVAvl YCPlNgEVQaEirrrW< - * 
1++ L+S+ GFf 4-4 P+ + + 
485 LFLFTItiNSIiYGPF-*-LFLWFCSQRCRSEA£AKA 516 

10 20 30 40 50 60 70 

inputs ATGACGCOSAGCCCCC^TTGCtGCTCCntSCTGCCGCCGCrroC^ 



BO .90 100 110 120 130 140 

inputs CCGCCCGAGGCCCCCCy^OATGGCQaACAAQQTCSGTCCCAOCWCAO^ 

tilt 

- CACX3 



150 160 170 180 190 200 210 

inputs GCQGCT6CAQT<K:CCAQTQQA0QaGaACCCGCCXK:CQ^^ 



220 230 240 2S0 260 270 280 

inputs CACAOCG<3CTGGACKXXKnTCCGCGTGCr6CCGGAG<3^ 



290 300 310 320 330 340 3S0 

inputs CCGOCGTOTACGTGTOMiAaaCCACCAACXKKJTT^^ 



360 370 380 390 400 410 420 

inputs <mTC3ACATTAOCCa«K3GAAGaAaAGCXrrt3G^ 



430 440 450 460 470 480 490 

inputs A0CCA0CAGTCK3GCACmCCGCQ<n*TCAGACA^ 
s s , J ; s 

- COTCCG - - 

10 

SOO SIO S20 S30 S40 550 560 



570 580 590 600 610 620 630 

inputs CGACCAGOCerTOACXKXKX^GAGGCCCKn^ 



640 650 660 670 680 690 700 

710 '^20 730 740 750 760 770 

input ACAAGGTGOATGTGATCCAGCGXSACCCGTTCCAAGCCCGTQCTCACAGGCAC^ 



780 790 800 8X0 620 830 840 

Inputs (30TGOACTTCXK;QGG<3Aa:ACGTCCTTCCA6T6CAA^ 



8S0 860 870 860 890 900 910 

inputs CTQAAQCXitCCn'GGAaTACGOCGCCaAGOGCCGCCAGAACrCC^^ 



920 930 940 950 980 970 960 

inputs TGGTG(n«CCCACGGGTGACX5TOTQaTCGCGGCCXX5AC^^ 

3titii:sttt: s;::;M««isttt :t titttitttttttt tMxiitttt::*: 

* *0<XCACX3G<mSATX?Kn<3(yrCA 

20 30 40 so 60 70 

990 1000 1010 1020 1030 1040 lOSO 

inputs TGCCCGCCAGaAaSATQOKKJCATGTACaV^^^ 

: i : : t : : : : 2 : : : : : t t j t i t : 1 1 : 1 1 s t : t s t . t t : t it ituxxtxsiiiti t « t : : at 
Qa<XGGGC»<36AaX3ArFeeTQGGAT6TAeATCT6€^ 

80 90 100 110 120 130 140 

1060 1070 1080 1090 1100 1110 1120 

inputs GCCr^CCrCACCGTOCTOCCAaACCCAAAACCXKXavG^^ 

: : : t 1 : s : : : : : : * : « : : : t : : : t i s : : i t : t 2 : t s i : it : : . t t : : . m : : * : 
(XlCrTCCTCACrGTATTACGAGACCCCAAACCTCa^GGGCC^ 

ISO 160 170 180 190 200 210 

1130 lUO IISO 1160 1170 1180 1190 

inputs GCCTQCCGTGGCCCOTOQTCATCGGCATCCCAGCOQQCOC^ 

t zt tt tttttttxitsitx a tittittitttttt^tsiti i i t t tt : s 

220 230 240 250 260 270 280 

1200 1210 1220 1230 1240 12S0 1260 

inputs GCTTTQCCAaXCCAOAAOAAaCCGT^ 

s i s : $ : 1 t s : • s t i s : : s i « 1 1 1 , 1 1 . 1 3 1 1 : t . i x • t t itt i t : s t : i 1 1 : it : i i : . 
QCTrrCKX^GACCAAaAAGAAOCCATGTGCCCO^^ 

290 310 320 330 340 350 

1270 1280 1290 1300 1310 1320 1330 

inputs GQQPJCaGC(XSQ(yiPXl<^^ 

i X i t i * Hit n t ittx tt^tttxixtxtit : t : : : « i : t s t : : 

GGGAaVTCCCX3A<yACX3CAGlX30TGAGAAaGACCr^ TO 

360 370 380 390 400 

1340 _ 1350 1360 137 0 1380 1390 1400 

"inputs TOGOOCflfiaTCTQAGGAGC^ 

Its: : * : i t j : } : : 1 2 : : : ; . 2 : t * . : ; : t t : : : t : : : . : an i : . : t t t s . i t : : t : 

TQGaCATATOTCy^GaAGCATG<3ATOMCCATQ<KXXCCC^ 

410 420 430 440 450 460 470 

FIGURE 27B 

1410 1420 1430 1440 1450 1460 

inputs CCCTAAOTTQTACrCCAAACTCTACACAaACATCCACAaiCACAa^C^ -CACACAC- -TCTCACACACA 
tit t : : : I s :: t i i i : i t $ « i . : : t : i : i : i i x : : t : : : : t t t t : ^ : . : t t t t . 

CCCCAAGCTGTACCCCAAGCrATACACAGATGTGCAa^CACACACAC^^ 

480 490 500 510 520 530 540 

1470 1480 1490 1500 1510 

inputs CTCACACGT-GGAGGGCAAGGT-C CACCAGCACATCCACTATCAGTGC 

; : : . : J : : : s : i : J :::::: : ; s : : ; j : : , : t t t j : t j j : i t i : 

CTCrCATGTTGGAGGGCAAGGTTCATCAACRCCAGCATGTCCAC^^ 

550 S€0 570 560 590 600 610 



CAAOCACTGTGTCC 
€20 

970 980 990 1000 1010 1020 1030 

GTGCTGCCCACGGGTGACOTGTGaTCXSCGGCCCXSACXKOTCCT^ 

i i i i i t : : : : : t : : : : : 2 i : M ! : t :::::::: t i : : : 

GTCCXMCCCACGOGTGATGTGTGQTCACGQCCTGATaKrrCCTACCT 

10 20 30 40 SO 60 70 

1040 1050 1060 1070 1080 1090 1100 

CCCGCCAGGACQATGCGGGCATGTACATCraCCirrGGCGCCAACACCATOOQCTAC^ 

CCCGCCAGGATGATGCTGGCATGTACATCTGCCTAGGTGCAAATACOVTGGQ^ 

80 90 100 110 120 130 140 

1110 1120 1130 1140 1150 1160 1170 

CTTCCrCACCX3TacraCCAGACCCAAAACCXKX:AGGQCCACC^ 

: t ; : : t : : : : : . : : : : : i :.:::«:; t : t : } : * t t : : . : t : t . t t t 

CTTCCTC»^CrOTATTACCAC3ACCCCAAACCrCCAOQ<KXnx:CT^ 

150 160 170 180 190 200 210

1180 1190 1200 1210 1220 1230 1240 

CTGCCGTGGCCaSTGGTCATCXSGCATCCCAGCCGaCQCTaTCT^ 

::ssi.t;:t: ::st: tit;ti:::«i::: {:s::::t;ts::t«x:$:: tti: t: 1 1 ti 
CTGCCAT0<KX:rraTGGTaATOKKaVTC^ 

220 230 240 250 260 270 280

1250 1260 1270 1280 1290 1300 1310 

s : t t : : : : s : : : . i : * : i : : : : . : : i 1 1 : t « « : : « t : : : : i : . 1 1 

TTTOCCMACCyUVGAAaAAGCCATGIXXXrCAOCATCrrA 

290 300 310 320 330 340 350

1320 1330 1340 1350 1360 1370 1380 

GACGGCCCGCGACCGCAGCGGAGACAAGGACCrrCCCTCGTT^ 



GACATCCCGAGAAOSCAGTGaTGACAAGGACCTGCCCTCATTG^ TOTG 

360 370 380 390 400 

1390 1400 1410 1420 1430 1440 1450 

GaCATATOTGAGGAGCATGGATCCGCCATGGCCCCCCAa<:y^CATCCTGG^ 

410 420 430 440 450 460 470 

1460 1470 1480 1490 1500 1510 1520

CTAAGTTGTACCCCAAACrCTACACAGACATCXa^CACAC^ -GACACAC— TCTCACACACACT 

:::: $::iitt::;.:t t:::x::: :s:i:t:3s:;:;: titssi: !.:.i::::,s: 
CCAAaCTGTACCCCAAaCTATACACAOATOTGCAaiCAaua^CA^^ 

480 490 500 510 520 530 540

1530 1540 1550 1560 1570 1580

CACACGT ♦ GGAGGGCAAGGT - C CACCAGCACATCCACTATCAGTOCTAQACGGCACCGTATCTQC 

: . : t : : ::::!::::::: : t . : : t : } : s : : t : : t s t : . : * : t t : . i : : : : 

CTOVTCnTGGAGQQCAAGaTTCATCAACACCAGCyVTGTCCACTAT^ 

550 560 570 580 590 600 610

1590 1600 1610 1620 1630 1640 1650

AGTGQGCACXKKKKK^0Ga:CAGACAGQCAGAC1X:^^ 

: . t t . : : : : : t « i . t : * i t : i t . • 1 1 s 

AA- - -QCACTOTGT- - CCTGA- -GGTAOQCAT- * TTOGGGGCCAAGGCAACAa- -OTTOa- -0 

620 630 640 650 660 

FIGURE 28A

1660 1670 1680 1690 1700 1710 1720

GGGACCCATGCKXSAGGAOOAATGGCCAGaVCCCCAGGC^^ 

: :: : : ::: ; : . 

ACJAATTGAGAACAATGOAGGAAG- - -AOTATCTTAGGGTOCCT-TATaGTGOACA- • -CTCACAAACTTG 
670 680 690 700 710 720 

1730 1740 1750 1760 1770 1780 1790 

CACAO^cyiGACACACACACTCKXriWJA-TaCATQTATQCA^ 

! 1 i i i i . t . I 1 I « J : . : . . . . : : ; : : x .si * ; . ; ; t $ ; t t i « : i * . 
GCCATATAGATGTATGTACTACCAGATGAAaiOCCAOCCACa^TTCACACAaSC^ 
730 740 750 760 770 780 790 

1800 1810 1820 1830 1840 1850 1860

GCACACGTACGCA- CA- CACGCACATGO^CAGATATGCCQCCnxaaaCACACAQATAAGCTGCC^^ 
. i I . : : . 2 . : ; : . i : j . » : : s . i : . i : : , . : : : . : : ♦ « . ♦ : . , , : t : j , . i : : 
AAAOSTGTGCAGAACTOCAaVCACAA-C-CTQAGAAACCOT 

800 810 820 830 840 850 860 

1870 1880 1890 1900 1910 1920 1930 

AO3GACACGCA-Cy^GAGACATGCCAaAACATACAA0GACATG-CTaCC^^ -CACACGCACACC 
t : : : } t t : . : : s t : : : .* . . i i t . : : t • i : : : : 

AGTGACATOTAGCGATGGCTAQTTGAAGGAATCTCCCrCATQTCTTAOT^ 

870 880 890 900 910 920 930 

1940 1950 1960 1970 1980 1990

CATGCX3CAGAMTG- - ^-CTCyxmGACACACh^ 

: jit it : : 1 1 ttnutt : . * « * • : « it t • t : : is::. . t i : t * . : 
CCT0C«a^TClt3TQTTCC?rGCCTGGCCrr^ 

940 950 960 970 980 990 

2000 2010 2020 2030 2040 2050 2060 

AGATATOGTATCOSGACAavCACGTGCAa^aATATQCr^ 

:. s.:t * . :.:s: .zss.tix. i . , ,;»:.«.:: :.: 

C- - -CTATCAACCTGACTGOOGTGAaCA GTGCAGCCATGCmXSQACQTOaAGCCACC- - - -CTC 

1000 1010 1020 1030 1040 1050

2070 2080 
ACACATOGACQGATATTa 

I t * : t : « 
CC-CTTGCTAQAGAGAAG 
1060 1070 

10 20 30 40 50 60 70

input S MTPSPLLLU.LPPLUiGAPPPAAAARGPPKKADKVVPRQVARLGRT\mLQC 



80 90 100 110 120 130 140 

input B HSGWSRPRVLPQOLKVKQ\^REDAGVYVCKATNGFGSLSVNyTLVVLDDlSPaKESriGPDSSSGGOEDPA 



150 160 170 180 190 200 210 

inputs SCK3WARPRFTQPSKMRRRVIARPVGSSVRIiKCVASGHPRPDlTWMKDIX)AIiTRPEAAEPRK^^ 



220 230 240 250 260 270 280 

inputs UlPeDSGKrrCRVSKtU^IHATYKNmVIQRTRSKPVLTQ^ 

. t * 

- --RVR r*" 



290 300 310 320 330 340 350 

inputs LKRVEYGftEGRKKSTXDVGGOKFVVLP1XS3VWSRPIX3SYX^KKi^ 

: 1 1 : : : t : : : : s : : : : : I : . s s t s $ : s : : : : : : ; s ; 1 1 I z : t : s 

PTC^VWSRPOGSrLOTOJtilSRARQDDAOMYiaUGAKW 

10 20 30 40 

360 370 380 390 400 410 420 

input S AFLTVLPDPKPPGPPVASSSSATSIiPWPWIGI PAGAVFILGTLLLWLCQAQKKPCTPAPAPPLPGKRPP 
: t t i t •:::::*::<: t i i t t t . :<:::::: 

AFLTVXiPDPKPPQPPMASSSSSTSLPWPWIQIPAGAVFlLGTVLLWLCQTKKKPCAPAST^ 
50 60 70 80 90 100 110

430 440 450 460 470 480

inputs GTARDRSGDKDLPSLAALSAGPGVGIiCBBHGSPAAPQHZiLGPGPVAGPKLYPKLriT>IH^^ 

i t , : . t : i t : i i t : : s s : . r i i s i : . : ; : ; ♦ : . . j « ♦ ; m .M J s t : .* i . : : : ; J : J . it 

GTSRERSGDKDLPSLA - VQlCEBHGSAMAPQHIIiASaSTAGPKLYPKLYTDVHTHTHTHTCrr^ 

120 130 140 150 160 170 180

490 500

inputs — --SHVEGKVHQHIHVQC 

: . . : 

XiSCWRARFIHTSKSTZSAKirSESPSTVS 
190 200



FIGURE 29 

ATGTCACCXSCCTCTGTGTCCCCTCCTTCTCCTGaCTGTQOaCCT^

10 20 30 40 50 60 70 



inputs 



GTOATCCaU^TACCTGCAGCTTCTGGGAAAGCTTCaiCTACCACCACCAAOT 

80 90 100 110 120 130 140 



inputs 



CA<K:CTaCrrCCCCrCAGAGCrcrCKX3A0CGGCCCTaaGAGQGCCC 

150 160 170 180 190 200 210



inputs — 

CAGAGGAAACTCCKKKriTCTAGGGATTCATTCTGCAa^ 

220 230 240 250 260 270 280 

inputs 

ATCX3TAGTGCACTGCAACC»CAAACA0GGAATGCOCT^ 

290 300 310 320 330 340 350 

inputs — — ^^^^^^^^^ 

360 370 380 390 400 410 420 

inputs — 

TGCCATG<3Xrrt(:rrATGAGAGCAGGGGGTTC^ 

430 440 450 460 470 480 490

inputs — , . 

GT0TGaCACCCAATCAGTGCCAATGTGTGCCACKKnX3<^^ 

500 510 520 530 540 550 560



inputs 



CCTTCAGCCXrroTACCCCTOGCTACTATGGCCCT<K:Cri^^ 

570 580 590 600 610 620 630 



inputs 



TGfCOATGCCXyVGACTGGAGCCTGCTTCTGCCCCGaVQAOJ^^ 

640 650 660 670 680 690 700

FIGURE 30A

inputs



CCCAGGGCACCTCrGGCTTCTTCTGCCCCAGCACCCATCClTGC 

710 720 730 740 750 760 770 



inputs 



ACAGGQCTCCTGa^GCTGCCCCCCrGOCTGGATGGGCACXATCTQCT^ 

780 790 800 810 820 830 840



CACXSGACCO^CTGCTCCCAGGAATGTCXSCTGCCyiau^CGaa^ 

850 860 870 880 890 900 910



inputs



QCCGCIXKX3CTCCGGGTTACACTGGGaAT03GTGCC<^^ 

920 930 940 950 960 970 980 



inputs



CTGTOCTGAGAOGTGCXaACTQOCKrCCGGACXX:^ ^ 

990 1000 1010 1020 1030 1040 1050

inputs ** — 

CACXKSCrTCACTOGGOACaSCTQCAaSGATCGCCTCrr^ 

1060 1070 1080 1090 1100 1110 1120 

inputs- -------------CGACC--- ------ ---------- — - — — 

: : t : ! 

CCCCCn^CACCTGCGAGCGGGAGCACAGCCraVGCTGCCACCCmTaAA 

1130 1140 1150 1160 1170 1180 1190

10 

inputs CACX3C- - 

: M t : 

GGGCrGGGCGGGCCTCa^CTOCAACQAaAOCTGCCCGC^^ 

1200 1210 1220 1230 1240 1250 1260



inputs 



TarCTCTGCCTGCACGGTGQCGTCrQCCa^GOCTACCAGCGGCCTC^^ 

1270 1280 1290 1300 1310 1320 1330 



inputs 



GCCCrCACTQTQCnCAGTCTTTOTCCTCCTtSAC^ 

1340 1350 1360 1370 1380 1390 1400 

FIGURE 30B

inputs



AAATGOATCXKrCTGCTCACCCATCGACGGCmGTOC^TCTGC^ 

1410 1420 1430 1440 1450 1460 1470 



inputs 



TCTGTGCCCraCCCACCCGGAACCTCKKSQCTTCAGTTGCA^ 

1480 1490 1500 1510 1520 1530 1540

inputs . — — • . Q 

1550 1560 1570 1580 1590 1600 1610

20 

inputs TCCQ------ « ..^ •QTGACCCT 

5 : ! s s 1 8 : J : : : 

TCCGAACKKSOCAGTTTGGAGAAGKJTTGTGCCAGTCG^ 

1620 1630 1640 1650 1660 1670 1680

30 40 50 60 70 80 90

inputs GTTCATOaAaVQTGCCGATaTCAGGCrGGlTGCJATGGGCAC^ 

: 1 1 X s I 1 1 t s . : ; t * , t s 1 1 1 1 : t i s tttum . t t : t : t : : : » j t : : t i i i i i 1 1 : i t t 
GTTCATGOACGCKm:XGTGCGAGGC^^ 

1690 1700 1710 1720 1730 1740 1750 

100 110 120 130 140 150 160 

input B TTTGGGGAQCCAACrrGCAGTAACACCrrGTACCTGCAA 

t«tit:t3S «:tt:: :i ttttttst :t}:i:::tMtit a t t s z t i : 
TATGOGGAGTCAACTGTAQCAAC/^CCTGOVCCrGCAAGAATGGGGGCAC^^ 

1760 1770 1780 1790 1800 1810 1820 

170 180 190 200 210 220 230 

inputs CrGCGTGTGC^CACCAGGGTTCCOAGGCCCXrrCCTQCCAaAG^ 

;:};::t:t :;«stt::.:;i::tt::::;:i;::. :ti: : :::::::::::: 

CroaSTOTGTGCACCCaQATTCCXKSGGCCXrCTCcnW 

1830 1840 1850 1860 1870 1880 1890 

240 250 260 270 280 290 300 

inputs CXSCrOTGTGCAATGCAAOTGTAACAACAACXAT^ 

$tti£:tt:: ttt::::: t t i x i :: t itttiw:: !t:.:i:t::::::t: ::::: 

dSCOTTGTGCCCTGCAAGTa - - - CGCTAACCACTCXriTCTGCCACCCCTCGAAC^^ 

1900 1910 1920 1930 1940 1950

310 320 330 340 350 360 370 

inputs TGaaSGQCTOGACAGGCCCTGACTGCTCCGAGGai^ 

:iti stits:a:::M:s tt t:t:: :t::t::s: . x i tt :::: 

1960 1970 1980 1990 2000 2010 2020

380 390 400 410 420 430 440 

inputs ACTCTGCOlOTGTCATCATQGTGGaACCn^ 

ti:t::.t:::t iststttsttistitst it:iit8:i:t:i:::ti::t:t: t t it::::: 
GACOTOCCAATOTOlCCATGaTXMGA 
2030 2040 2050 2060 2070 2080 2090 

FIGURE 30C 

450 460 470 480 490 500 510

input 8 ACTGGACCCAACTGCTTGGAftGGCTGCCCACCAAGAATGTT^ 

ACKSGACACa^CTGCTTAGAAGGCTGCCCrcr^^ 
2100 2110 2120 2130 2140 2150 2160 

520 530 540 550 560 570 580

inputs GTGATCrCGGAGAGATGTCKrCACCCAGAGACTGGGGCTTGTOW 

GTQQTCCTGGAGAAAAGTGCCACCCAGAGACIXKKWCCTGTOTATGTCCCC^ 
2170 2180 2190 2200 2210 2220 2230 

590 600 610 620 630 640 650 

inputs CTGCAAAATGGGAAQCCAGaAGTCCTTCACCATAATGCCC^^ 

i:ts..:: ttxt ttttitt its; :: .i^ttxtt iti*nzx 1 1 : : : : t . t : : t i t 

TTOCAGGArWGAATCCyVCKSAQCCCITrACnOTaA 
2240 2250 2260 2270 2280 2290 2300 

660 670 680 690 700 710 720 

inputs aCAGTGATTGGCATTOCAOTACTGGGAACCCTCGTGGTa^ 

s : I t t : t : t t s i i : t : t ::«::: i t t :::: t :::: $ 2 t : t t 

GCAGTGATTGOCATTGCAGTGCTGGGGtCCCITGTGGTAGCCC^^ 
2310 2320 2330 2340 2350 2360 2370 

730 740 750 760 770 780 790 

inputs AGTOGKiAAAAQGGCAAaGAACAimaGACTTGQCAGTQQCirrAC^^ 

: : t t t : : ; : : t : t :::: : : : : x : t : t : : : : i : t : s 

'ACntSQeAAAAAGGCAAQGAGa^eCACCACCTaO^ 
2380 2390 2400 2410 2420 2430 2440

800 810 820 830 840 850 860 

inputs TTACXjrCATGCaUJATGTCTCTCCXSAGCTATAG 

ts :i:t:titsitt2it t::i:i::xiii;:t«]it2:ti2:s!:s::ai:sttt:t:j i:::: 
GTATOTCATGCCaW3AT(3TCCCTCCQAQCTAa«n'CACTACTACT 
2450 2460 2470 2480 2490 2500 2510 

870 880 890 900 910 920 930

inputs CAGTOTTCrCCTAACCCC^CGCCCCCTAACAAGGTCCCAGGCAGTCAGCrCT^ 

:::s.t :: ti.t:::it:t.:t::t:t:tt:s:; zmtt t ::::;:t: tttt sst. 1 1 
CAGTGCrCCCOVAACCCCCCACCCCCTAACAAGaW - -CaKHCTTTGCCAaCCTQCAOAACC 

2520 2530 2540 2550 2560 2570 2580 

940 950 960 970 980 990 1000 

inputs CTTGAOCOGCCyVAaCAGAGCCCACGGGCGTGAaAACCAT^ 

: : : t } : : ] : t : . : . z . i : i : : : : : i . : i : : : : : : : ; : i : tit:: : i : x : : : ; : s : s : t : : : 3 s t 
CrrGAGO3GCCAGGTaG<K;cayUVQG0CATGATAAC^ 

2590 2600 2610 2620 2630 2640 2650

1010 1020 1030 1040 1050 1060 

. input 8 GGAGCCCCAT- QACAQAGGCGCCyiaCCy^CCItSGACCXSAAGCrrATAaCroTAGCTATAGC 

GGAGCCCCCrrCCAGGGCCTCTGGACAGGGGGAGa^GCC^ 

2660 2670 2680 2690 2700 2710 2720 

1070 1080 1090 1100 1110 1120 1130 

inputs CACAOGAATGGCCXAGOACCATTCnxn^CATAAAaaTCCCATCrC^^ 

s t :::::: : t : : i : j : . :::::: s : : : : i s : ; : : t : : i : . « : : : : i s J ; : s » • 

AATGGCCCAGGCCOlTTCrrAaSATAAAGGGCTCATCrr CrOAAOAGGA^ 

2730 2740 2750 2760 2770 2780 

FIGURE 30D 

1140 1150 1160 1170 1180 1190 1200

input s TGTCCCTGAGGAGTGAOAACCCCrrATQCTACCATCaSAGACCTOCCCyvqCCrrGCCT 

:::::::: I 1 : : : i : t i : i . m t : : : t i : : : : ::t:.:j!t. i;:::.:: 

CrrrCCCTGAGCAGTGAOAACCCATATGCCACCATCCGGGACCn^ 
2790 2800 2810 2820 2830 2840 2850

1210 1220 1230 1240 1250 1260 1270

input 8 AAGTGGCTATGTCKJAOATQAAAOOACCTCCATCAQTGTCCCCrrCCCAGGaVGTCTCOT 

GAOCAGCTACATGGAGATOAAAOaCCCTCCCTCAGGATCTGCCCCCAGGCAGCCTCCT 
2860 2870 2880 2890 2900 2910 2920 

1280 1290 1300 1310 1320 1330 1340 

inputs AGGCAQ---CAGCGGCAACTXKy^GCCACAGAGGaACAGa3GCACCT^ 

i i : i : : « : ; : s : } ; i i t i : ; j i : : ; ::::::::::::::: : : : : : 

AOCCAGAGGCGGCGaaUlCCCCAGCCACAaAaAaACAGTGGCACCTACaAG^ 
2930 2940 2950 2960 2970 2980 2990

1350 1360 1370 1380 1390 1400 1410

inputs ATAATQAAGAOTCnPTTGGQ<n'CCACGCCaXG^ 

1 1 • « . t M t } I : : : t 1 1 : s 1 1 i i s : 1 1 1 1 m I . £ : : 1 1 . t s it : t : : i ; s 1 1 1 1 s : s 
ATGACCGAGACrCTGTGQGCTCCXaVGCCCCCTCTG^ 
3000 3010 3020 3030 3040 3050 3060 

1420 1430 1440 1450 1460 1470 1480

input© CAAGAACAGCCATATCCCTGCUlCACTATGACTTGCCTCCAOTAC^^ 

: : t : I : : i : : : : : ; 1 1 : : : : : s : ::::::::: i : t i i: t t : : « $ : : : j . 

3070 3080 3090 3100 3110 3120 3130 

1490 

inputs CGCCAQOACCGC 



CGCX».GGACCGT 
3140 3150



FIGURE 30E 

1890 1900 1910 1920 1930 1940 1950

0ACXACTC1X3Al«QCrKSTGACCCTGTTCATGGACGCTO 

: : t : t . t t : : i t s :: t « ; ; : . . : : : i j : : : t i : t : : : : : t . • : : : t 

GACC-CAC-QC»TCCGGTQACCCTGTTCATCX5ACAGTGCCGAT0TC»0GC^^ 

10 20 30 40 50 60 70 

1960 1970 1980 1990 2000 2010 2020 

GCCACCTGTCCTGCCCTGAGGGCTTATCKKMAGTCAACTGTAGCAAC^ 

i jj::: i t i : :!::::;:;::::; ;: 

GCCACCTGCCTTGCCCOGACWGCTTTTOGGOAGCCAACrro 

80 90 100 110 120 130 140 

2030 2040 2050 2060 2070 2080 2090

CACCTOTCTCCCTGAaAATGaa^CrrQCGTGTGTaCACCCm^T^ 

i :: s t t : :2{i: > t * t : : t i . j : : t : t : : : t t : } x : : : . :: 
TACCTQTGTGTCrraAaAATGGCAACTGaSTQTGCXKaiCC^ 

150 160 170 180 190 200 210

2100 2110 2120 2130 2140 2150 2160

TGTCAGCcrrGK3ccx3CtATG<x»^ - -(MCTAACCAcrccrrcrraccAc^ 

t: t ittiii i i : : : : : M : 2 1 X t : : i : : i s : :::tf::s t. mtt it s ::ts:3s: 
TGCXrCGCCTGGTCGCTATGOCyVAACGCTGTGTGaAT^^ 

220 230 240 250 260 270 280 

2170 2180 2190 2200 2210 2220 2230 

CCTCaAAa3GGACCT0CTACTGCCTGGCT<^^ 

: : : 2 . : : : s 2 : : : i : : : : 2 : 1 1 : : : : :::: t :::::: i t t s : : t t : : : : : : : : : : : : : : : : : 
CATCOaAaSGGACCniXKrrGCTOGGTGGGGGOG^^ 

290 300 310 320 330 340 350 

2240 2250 2260 2270 2280 2290 2300 

ACavCTGGGGAGAAAACTGTGCCCAQACCTGCOUiLTGTCACCAT^ 

: : t : t : : : : . : : : t t i : t . i :: t i m :: i t t : : ; : : : t t : : : t : : 
CCACTGGGGACrcyUU^TOCTCCCa^CTCTGCCAGTQTCy^Ta^T^ 

360 370 380 390 400 410 420 

2310 2320 2330 2340 2350 2360 2370

AQcraTATCTOCcccxrrAaGCTOaAcrGQAa^cavc^^ 

: : t ::::::::: i : t : : t ;: i t t t : i i . . : . t . t t : t t t : 

AG<nx3TATCrrGCAa3CaiGG<nX3aACTOaACCCAACTO 

430 440 450 460 470 480 490 

2380 2390 2400 2410 2420 2430 2440 

CTAACTaCTCCCAGCCATGa:AGTQTGOTCCrQ^ 

:$::::i:3t::: ts:ii:t,:: t:s:i.s.;}tt:::ttit3:s::t:t::: t: 

TCAACraCTCCCAOCPATOTCAOTGTaATCrCWaAGA^ 

500 510 520 530 540 550 560

2450 2460 2470 2480 2490 2500 2510 

TCCa:CAGOGCACAGTGaTGaVCCTTGCAGQATTOaAATCC^ 
tti::t:::.it:tt::i3::t :t::..t: :::: it:: :: 

TCaXCAGC5ACACAGTGGTGCAQAa^Ka^AAAT<X^^ 

570 580 590 600 610 620 630

2520 2530 2540 2550 2560 2570 2580 

CCAGTAGCGTATAACTCaCTGGGTGCAGTGATTGGCATTaCAGTG<^^ 

it M::t:s:ti::::::i:t::tt:::4t2t::.*t:ti :t;i:.:ii:it. 

CCCMTGACCCATAACTCACTGOGTOa^lTaATTaaCATTO^ 

640 650 660 670 680 690 700 

FIGURE 31A 

2590 2600 2610 2620 2630 2640 2650

TGOCACTGTTCATTGGCrATCX3aaVCTaaCAAAAAGGaU\C^ 

TAGCACTCTTCATTGGCTACCGCCAGTGGc:yW^G<K5CAAG0AA^ 

710 720 730 740 750 760 770 

2660 2670 2680 2690 2700 2710 2720 

CAGCXSOTCGCCTGOACCXSCTCC^AaTATGTCATGCCAGATGTCX:^ 

s ! : : : : : t : ; t : : s : : : i : : i ;:::::{;:::::;: : : i t : : : : t : t : ; • • 

CACTGGGCGGCTGa^TGacrCTGATTACXSTCATGCCAGATGTCrrCT 

780 790 800 810 820 830 840

2730 2740 2750 2760 2770 2780 

AACa:CA<Xn:'ACavaiCCCTGTCGCAGTOCTCCCCAAACCCCCC^^ " 
t : X t t : : I 1 1 : X t 1 1 1 X t : : : : t : i ; : : s : x . : t : t t : : : . i : { : i : t : ; : t : ; ; : : : : t : 
AACCrCAGCTACCACACMTGTCTCavOTGTTCTCCTAACCCCCC^ 

850 860 870 880 890 900 910

2790 2800 2810 2820 2830 2840 2850 

CXKrrCTTTQCCAQCCTOCAGAACCCTGAGC^^ 

xxstx::: xt:: txi. ttit:i:]i:2::.: .t4xt::; ::::x it::: :: 

AGCTCmxSTOVGCTCTCAGGCXJCCrGAGC^^ 

920 930 940 950 960 970 980 

2860 2870 2880 2890 2900 2910 2920

GCCrOCTSACTQQAAOCACCXKXmaAOCC^ 

it i : i : : : X : X : : : : : : : : X : : i : t : : : : i : x : x x t : . : 2 . : : i : : . t i x x x : : 

aCCCGCTGACrOGAAQavOGeGCGGGAOGGCCAT r -GACAGAGGeGCCAQCCACCTOGAC 

990 1000 1010 1020 1030 

2930 2940 2950 2960 2970 2980 2990 

CQAAGCTACAGCTATAaCTACAGC- AATGGCCCAGQCCCATTCTACGATAAAGaaCTCATCTCTG 

itxxxtxs xxxi.xxiixx It: xsxxxxix::: xiixxii* txxiit: x iixtxtxx 

GXSAAGCTATAGerOTAaCTATAaCCACAGGAATGGCCCAGGACCATT^^ 

1040 1050 1060 1070 1080 1090 1100

3000 3010 3020 3030 3040 3050 3060 

AAGAGaAOCTCGG<XK:CAGTGTGGCTTCCCTaAGCAGTaAGAACCCA^ 

AAGAGGGACTAGGGGCAAGCGTTATGTCCCraAGCAGTGAGAACCX:C^^ 
1110 1120 1130 1140 1150 1160 1170 

3070 3080 3090 3100 3110 3120 3130 

CAGCTTGCCAQGaGGOCCCXOaaAaAGCAOCrACATGQAGATaA 

: X ; X s j X X . ; X t X • x t x x x « : x « : x . : t : x . : : t : : x : : x x : t : t t t i : t : *. t . : : x x x x 
CAGCCTGCCTGOGGAACXXXrGAGAAAGTGGCTAT^^ 
1180 1190 1200 1210 1220 1230 1240 

3140 3150 3160 3170 3180 3190 3200 

AOaCAGCCrCCTCAGTTTTGQQAa^GCCAOAGGCGGOGGCy^ 

ittxix'xx: I i I- : xxxxxxi :-xi :*:xitxxt: xxx:tx:::it.t:;:: ::::::: 

AGGCaaTCTCTTCATCTCCGaOACAGGCAO CfiaCQQCfACTiSCAQCC^C^^ 

1250 1260 1270 1280 1290 1300 1310 

3210 3220 3230 3240 3250 3260 3270 

ACaAGaVGCCava<XrCCTGATCCATGACCGAGACTCTGTGaGC^ 

: X t : : X : X : X X X X t X X :xx xtxs.x .xtt tx: xxxx:::: xtxxtx tx xxxxi.xxxxx 
AimGCAGCCCAGCCCCTTGAOCCATAATQAAaAaTC^^ 

1320 1330 1340 1350 1360 1370 1380 

FIGURE 31B

3280 3290 3300 3310 3320 3330 3340

ACCCCCC<3qCCACrATGACrCACX:CAAGAACAGCCACATCCCrraGACAl^^ 

* i i • ' ' i i : • i t I I t 1 t I : : : M t : : : t : ; : ! ; ; ! : j i : : t ; t $ j . t . 

GCCTCCTGGTaiCTACGACTCrCCauVGAACAGCCATATCCCrGGACACTATC^ 

1390 1400 1410 1420 1430 1440 1450

3350 3360 3370 3380 3390 3400 3410

CATCCCCCATCACCTCa^CTTCGACGCC».GraACOOTTCUia<^ 

t t : ! J ; : ; } : tit::: t t ,;:!:: j j :: j : : t i .: t j j j . : : . . t ; 

CATCCrCCATCCCCrrCCATCCC(KX^3CCACmCCX3CTQAAGAGCa30CAT^ - -GGAGC 

1460 1470 1480 1490 1500 1510

3420 3430 3440 3450 3460 3470 3480

ACCTGacrGTTGCTGCTCAAGGCTG<30<y^aVGA(XX^^ 

:.tt::t ttt::: :tsss:ss::it:st tstss:. 

- OTaCCTA-TOTACCT-TOCCAaaAaCAaOGACTGGACCA 

1520 1530 1540 1550 

3490 3500 3510 3520 3530 3540 3550 

GOlGGCTGTXSAACATGAACAACGCrTAACAGAGGAAGTGATGO-G^ 

i::::: • ::::: «:t:i m:.* ..«;« t.*t :: tixt t ;:; :ts** 

GCAGGCCACGAACAGAAACA* - -CTTGGTGAAGTGAAOVGAQACGQAerGTGGCCCrGTGCT^ 

1560 1570 1580 1590 1600 1610 1620

3560 3570 3580 3590 3600 3610 

GQGAOACXKllTQATCAGCACmTGCCTGOCTCCCT^ - - 

..:t...:: : :::: :::::jt::::t2 i t i i ;t 

GGGAOACACTAGTTGAaVAAGTQTCTAACCCTCTTTTCCAACCCA^ 

1630 1640 1650 1660 1670 1680 

3620 3630 3640 3650 3660 3670 3680 

- -CCTGTGTACATAAACrGGTGGGTTGOAAGTTGCTGGGTAAC-T 

: : : : . : : : : . : : : : : : . : . : . : . : . : : : . . t . } . i : : t : s t : 

AGCTGGTGQGCAQAATGTTGTTGTACAAGTGTGATTTTAQATC ^ 
1690 1700 1710 1720 1730 1740 1750 

3690 3700 3710 3720

ACCTTTTCTGTGC- - ATGCTCAQCCTGGGCTCTGTGCGT 
:; s !::::::: : : : : t : : : : : « t : i t t : : 
ACCTTTTCTGTGTGTATGCTCAGGCAGG- - -CTGTG- - - 
1760 1770 1780 1790 

3760 3770 3780 3790 3800 3810 3820 

AG-GCAGaTrCTOTCCrAaGGCACTTACaVTrrAGTAG<mGATGQAACC^ 
<x . i:::::ti ::;t::*t::: «• i« i * i 

GGTATAGGTTCTG- CXTTTCTGCACTrrCCATCTTATCTA^ - CTTCCAAGCTTA- ACTAOTTA 

1830 1840 1850 1860 1870 1880

3830 3840 3850 3860 3870 3880 3890

TAGCCrCCTAACTGGCCTCCrCCATTGATTCAGTQAACCm 
is::::. : * : . . . . j : : t : : . : : t i 

QAGC- TCCA CCAGCAGCA- -OOCCCTAACTACCTQQCT -GCCC TTCA C 

1890 1900 1910 1920 1930 

3900 3910 3920 3930 3940 3950 3960 

AGOCTGGTTAQTTACTCCaPACCrOAAAaCCTTCATAGGTQC^^ 

! . ! s . : J . . t i j i : t : t , i t i t t i a t » t . t 

- - -CCAGTAA- -TCCTCCATGTCT- -TTGC- -TCAGAGOA TTOCTC OC CGACTC 

1940 1950 1960 1970 

FIGURE 31C



3730 3740 3750 

GTGTGTQTTTCFGTQATTTTAGAAQGQTACC 

: i t : : t 3 : t . : : t t : t . : t . . : 
-TaTOTCTCTAGTTGQCrTTAGAGGQAOTCA 
1800 1810 1820 






3970 3980 3990 4000 4010 4020 4030 

TTTTGAAGGCOTAAAGGCCCTXJCrrTTGCCTOaCCa^TC^ 

: t t . . : i it i t t t : * : s t t : . : t t : : t « : : : s i : : : : : • 

TGOTGTTGTCCT CCTGOTAOGCCTTG ACOGTC-CTOCAGTCTC-CCT tTC 

1980 1990 2000 2010 2020 

4040 4050 4060 4070 4080 4090 4100

crOTCACTGCACqCCAQTCACACCGGCCTCTAQQTCCTCCTGTAQGCCACT ^ 

CCGTCT-TGCT- -TCATTCTTTC- - -CCAOAATQAAGOC-TGTCTQCCACCCrACrTCCCAGCCC^ 
2030 2040 2050 2060 2070 2080

4110 4120 4130 4140 4150 4160 4170

CCTGaVCACCTOaAGTGCCCTTCCTeXCaaiCTC^ 

i:::t ::*«. : i. tti :t : : .:: ::i t :t::t:t ttti ax i tst 
TTGGCACATCTAAGTT- - - CAGCOTTCCTAAGTTACCSCaTTaAaTCCTGCTTO -CACATATTCC 
2090 2100 2110 2120 2130 2140 2150

4180 4190 4200 4210 4220 4230 4240 

TCAGGGAAGTGCCCACC<rrCXX3TACATCTrrCACAGCCCnX3^ 

• ts:«. i s::ttt it mm:: *t« : M.M..ti t im*: ::t imm 
ACAGAACA- - -CCCACC- - CC- -ACATCT- - -GCTTC- * - - ATAGCTACrCTCTTCTC-CAC- - -QTACC 
2160 2170 2180 2190 2200 

4250 4260 4270 4280 4290 4300 4310

TQa^GAAGGCCTACAGGGTGCCAGGCR C TTGTT^ fA ATOqGTT C ^ 

.:M::Mt : . m:.::mm: * * i * :i«M m :.:t«:st m. m t**: 
CACAGAAGGCAGAAGTGGTACCAGGCAAOA- - AGATGGGATTGTSKKyiTTTTGT- -TTTGTTTTTGAGAC 

2210 2220 2230 2240 2250 2260 2270

4320 4330 4340 4350 4360 4370 4380

TCTQCCTCCCCCavCTAGACTOTAAGCTCCCTGAAGOCAAGAATCCTa- -TGCTTA'TGCTCAATATTAQCT 
: : : M : , : t t . t . ti i t i : M m . m : : m : m ) . i . . . m ; 
TerGTCrCACTATGTAGTCCTGGCTOQCCrGQAACTc:^ 

2280 2290 2300 2310 2320 2330 2340

4390 4400 4410 4420 4430 4440 

CTCCCTT- -GGCACAGAGT- - -AGGCACrrCAAauU^.-TGCTCX:a:AAAAGaCTaA6T^^ 

t. m:*m:.m i mm« «tM, Mt.t tt: : t Ms.. 

GGGTTTAA<XGCrCAaGGTCACATGCACAG<n'CAAGC^^ 

2350 2360 2370 2380 2390 2400

4450 4460 4470 4480 4490 4500 4510

AAGTACCAGTGACATGGAaTAACTQa^AAGATAGATGAGCC^ 

.M«* t*.2«s :.: s .s; :.m ..st : .ctMS :.;.:.Mt m: 

TAOAT TAGOST-CTaCCTCXXXrCTAQ-TGaAGAGGCTCBlTCGCC^ -CTGATGCAGOACTC 

2410 2420 2430 2440 2450 2460 

4520 4530 4540 4550 4560 4570 4580

AATAAGTTGGAGACT - TCCCTAAAGGGTQGCATTTCCCCAGGCTAAOUVaSCAGAGCrcyiGGT^ 
« . . . : : . : . . r. : t m : . . M : : m : m M m : . m it,., : : t , i 

TGOTGTTTAQaCTCACTaVCrATTGGTTT-tXOTGGCACW -AATGTtCCTCTA 
2470 2480 2490 2500 2510 2520 2530 

4590 4600 4610

GOTaCGJVGGGOaWSQGGTOCAOAGGGGCnXSAOGC 

* . . M . . . . « }.....« t . t . t M t t M 

AAAGCTOAAAAAAAAAAAAAAAAAGGGCXaGCCGC 
2540 2550 2560 

10 20 30 40 50 60 70

input 6 MSPPLCPLLLXJ^VGLRIiAGTliNPSDPrn^CSFWBSFTTTTKESHSRPPSLLPSEPCERPWEaPHTCPSPQT 



80 90 100 110 120 130 140

input 6 QftiaZ4ASRDSFC>IVCVGAGV0WRDRSAI^PQTGKALSMRPQPRVLSGAPSLASPGI^ 



150 160 170 180 190 200 210

inputs CHGFyESROFCVPI^OECVHGRCVAPNQCQCVPOHRODDCSSAPKCLQPCT^ 



220 230 240 250 260 270 280 

inputs CDPQTGACFCPAERTGPSCDVSCSQGTSGFPCPSTHPCQNGGVFQTPQGSCSCPPaWMGTICSLPCPEGF 



290 300 310 320 330 340 350

Inputs KGPKCSQECRCKNGGLCDRPTGQaiCAPGyTODRCaiEECPVGRFGOD^ 



360 370 380 390 400 410 420 

inputs HOFTGDRCTDRliCPDGFYOLSCQAPCTCDREHSLSiMPMNQECSCLPGHAGtJ^ 

^ - STHASG 

430 440 450 460 470 480 490

inputs CLOiHGGVCQATSOIXXJCAPGYTaPHCU^IiCPPimGW 



500 510 520 530 540 550 560

Input 8 SVPCPPOTHQFSCKASCCK^AHEAVCSPQTGACTCTPGMKOAKCQLPCPKGQFQEGCASRCD^ 

i I 

- --DP 

570 580 590 600 610 620 630

input 6 VHGRCCK^AGWMOARCm^SCPEaLKGVNCSNTCrrCKNaGTCLPENGNCVa^G^^ 

t ; 1 , : , J t t 1 J 1 : . : J : : * : J I : . t : . ; J ; 1 J : J J I 1 1 1 i : » . : : : t I 5 ! t J ; 5 5 « » « ' * « • 5 • * « « * * * 
VHGQCRCQAGWMOTRCHLPCPEGFWGANCSNTCTCKNGGTCVSBKaNCVa^FRaPSCQRPCP^ 
10 20 30 40 50 60 70

640 650 660 670 680 690

inputs RCVPCKCAN* HSFCKPSNGTCVCLAGVJTGPOCSQPCPFGHWGENCAQTCQOmGGTCMPQDaSCICPLGVl 

R<:n^0CKCJWNHSSCHPSIX3TCSCLAGWTGPDCSEACPPGHWGt,KCSQLCQCHH0^^ 

80 90 100 110 120 130 140 

700 710 720 730 740 750 760

i npu t £ TGKKCLEOCPUSTFGANCSQPCOCGPGEXCHPETQACVCPPGHSGAPCRXGZQEPFTVMPTTPVA^^ 

: : . : i : t : : : : s . s s : t : t M t s . . : s ) . : i . t : : . t : « 1 1 a 

TGPNCLEGCPPRMFGVNCS0trCQa)liQEMCTPETGACVCPPOHSQM}CKMGSQES^ 
150 160 170 180 190 200 210

770 780 790 800 810 820 830

inputs AVIGIAVUJSLVVALVALFIGYRHWOKGKEHHHLAVAYSSQRLrXSSEYVMPDVPPSYSHY^ 

: : : : i : : s : . : : t : : . : : t I : : : . t 2 s ; : : } : s i : : : : « t : : : 1 1 . : : t i t t . t i : : 1 1 : i 1 1 : : t : i { 
AVXGIAVU3TLWJd.XAIiFZOyRQHOKGKBHEHlAVAYSTORl4DOSDY^ 
220 230 240 250 260 270 280 

840 850 860 870 880 890 900 

inputs QGSPNPPPPNKVPGP-LFASLQNPERPGGAQGHDNHTTLPADWIOnmEPPPGPIiDRGSSRLDRSySYSYS 
: 2 t t t t : t . : . : . . m t : i ; t : t : : ; : : : :tt. tt: 

QCSPNPPPPMKVPGSQLFVSSQAPCRPSIUaiGREMHTTLPADHKHXmEPK DRQASHIiDRSYSCSYS 

290 300 310 320 330 340 350 

910 920 930 940 950 960 970 

inputs --K6POPFYDK6XiZ5E€El/3A5VA6tiSS£3lP¥ATI^LPd]liPGOPRESi9yMS^ 

* it : : ! : it::: ::::;:::::::::::::: : : $ r« i « : t : : : : t t . : 1 1 . \ * * i 
HRNGPGPFCHKGPISEEGt/SASVMSIiSSENPYATIRDLPSLPOEPRESGYVEMKGPPSVSPPRQSLHlJ^ 
360 370 380 390 400 410 420

980 990 1000 1010 1020 1030 1040

Inputs SQimRQPQPQRDSGTYEQPSPLiIHDRDSVGSQPPX^PPGLPPGKYDSPXNSBXPGKYDLPPVRHPPSPPIiR 
:.; : ttttttininttt t. «tttMitii:tt:;ti>sti*:ttttttiti<ttt : 

RQQR^QLQPQIU)SaTYEQPSPLSHMBESLGSTPPLPPGbPPaHYDSPiaiSHIPaim5LPPVRHPPSPPSR 
430 440 450 460 470 480 490 

1050
inputs RQDR 
I : X t 
RQDR 

GTCCGACCCACMCXSTCCaAGCCACACCCrGAAGGTGGTT^^ 79

CCAGAAOlQCyVTCrGOCTTCCCAGACCCATQCTGOCCACCACr^ 158 

TGTTGTTGGOTGCCCTOTGGCAGGCTTGTGCy^TGCCACT 237 

TGGAAGACTCAACTCCAATGATCCCAATGTCTGTACCTTCTQGGAAAOC^ 316 

CGCCCCTTCAQCCrQCCCCCAGCCGAOTCCTGCGACAGOCCCrGCW 395

TCTACCGGACTGTGTACOGTCAGGTGOTGAACU^TaaACnrCCa^ 474 

CAOTGGAGCCTGTQTCCCACTCTaTQCCCAGGAGTOTGTCCACX^ 553

CCAOGCTCKSasaQGTGACGACTGTTCCAaiXSAa^^ 632 

GTGOCAACAqCAOtTCCn^TGATCCCAGaAaTOaGOTOTOTT^^ 711 

GCCTTGCCCCQATGGCCACTATGGTCCnXSCCTOCCMT^^ 790 

OQAGCCn K riTC rQ CCCCCCAGGQAGAACAQQACCCAGGQCAC^ 869 

M O V I C S 6 

LPCP EOP HOP MCTQ ECRCHN 26 

CTG CCA TGC CCA GAO- GGT TTC CAC GGA CCC AAC TGT ACT. CAQ GAA TGT CGT TQC CAC AAT 1002 

aOti CDRF TO Q C HCA P GY I GD 46 
GGT GGC CTT TOT GAG AGG TTT ACT GOG CAG TGC CAC TGT GOT CCT GQC TAT ATC GOG GAT 1062 

RCREE CPVGR FOQDCABTCD 66 
CGG TGC CGT GAA GAG TGC CCT GTO GGC COC TTC GGT CAA GAC TGT OCT GAG ACC TOT GAC 1122 

CAPG ARCP PA MGACLCEHGP 86 
TOT OCT CCT GGC GCT CGT TGC TTT CCT GCC AAT GGC GCG TOT CTG TGC GAA CAT GGC TTC 1182 

TODRC TERL CPDGRYGLSCQ 106 
ACA GGC GAC CGC TGC ACT GAG CGA CTC TGT CCA GAT GGC CGC TAT GGT CTG AGC TGC CAA 1242 

PPCTCDPE HSL SCHPMH QEC 126 
GAT CCC TGC ACC TGC GAC CCA GAA CAC ACT CTC AGC TGC CAC CCA ATG CAC GGC GAG TGC 1302 

SCQP OWAGLHCMBS C PQDTH 146 
TCC TGC CAG CCA GGT TGG GCG GGC CTC CAC TGC AAC GAG AGC TGC OCT CAG GAG ACO CAC 1362 

QAGCQEHCLCLHG GVCL ADS 166
GGA GCC GOT TGC CAG GAG CAC TQC CTC TGT CTG CAC GGC GGT GTT TGC CTC GCC GAC AGC 1422 

G UC RCAPGYTGP HCANLC. PP 186 
GGC CTC TGC- CGG TGT GCA OCT GGC TAC ACG GGA CCT GAC TGC GCT AAT CTT TGT CCA CCT 1482 

NT YG IM CSS HCSCEHAIACS 206 
AAC ACT TAT GGG ATC AAC TOT TCC TCC CAC TOO TCC TOT GAA AAT OCC ATT GCC TGC TCT 1542

PVDGTCICKEGWQRGMCSVP 226 
CCT GTC GAC GGC ACG TGC ATC TGC AAG GAA GGT TGG CAG CGT GOT AAC TOO TCT OTG CCC 1602 

CPP aTWG PSCM ASCQCAHEG 246
TOT CCC CCT GQC ACC TGG GOC TTC AGT TQC AAT OCC AGT TGC CAG TOT GCC CAC GAG GGA 1662 



V CSPQTOACTCTPGWRGVHC 266 

GTC TGC AGC CCC CAA ACT GGA GCC TGT ACTT TGC ACC CCT GGG TGG CGT GGG GTT CAC TQC 1722 

Q L PCPKGQPG E OC ASVCOC O 286 

CAA CTT CCQ TQC CCG AAG GGA CAO TTT GOT GAA GGT TGT GCC AGT GTC TGT QAC TOT GAC 1782 

HSD G CDPVHGHCRCQA GWMG 306 

CAC TCC GAT GGC TGT GAC CCT GTT CAT GGA CAC TGC CGA TGT CAG GCT GGC TGG ATG GGC 1842 

TiRCHL PCPEG PWGAN CSMA C 326 

ACA COT TGC CAC CTG CCT TQC CCA GAG GGC TTT TOG GGA GCC AAC TQC AGC AAT GCC TOT 1902 

T C K N G 6 T C V P E N 0 MC V C A P G 346 

ACC TGC AAG AAT GGT GGC ACT TGT GTA CCT GAG AAC GGC AAC TOT GTG TGC OCA CCA GGG 1962 

F R G P S C Q R P C P P G R Y 0 K R C V 366 

TTC AGA GGC CCC TCC TGC CAG AGO CCC TGC CCG CCT GGT GGC TAT GGC AAA CGC TGT GTG 2022 

P C K . C K M H S S C H P S D G T C S C L 386 

CCC TGC AAG TGC AAC AAC CAT TCT TCC TGC CAC CCG TCG GAT GGG ACC TGC TCC TGC CTG 2082 

A GKTGPDCSESCPPG HWG LK 406 

OCA GGC TGG ACA GGC CCT GAC TGC TCT GAA TCA TOT CCC CCA OGC CAC TGG GGA CTC AAA 2142 

CSQPCQCH R GAT CHPODGSC 426 

TGC TCC CAA OCC TOO CAG TGT CAT CAT GGT GCC ACC TGC CAC CCC CAG GAT GGG AGC TGT 2202 

VC IP OWTGPMCS EGCP SRMP 446 

GTC TGC ATC CCA GGC TGG ACT GGA CCC AAC TGC TCG GAA GGC TGC CCA TCA AGA ATG TTT 2262 

GVKCS OLCQCDPGE MCHPET 466 

GGT GTC AAC TGC TCC CAG CTA TGT CAG TGT GAT CCT GGA GAG ATG TGC CAC CCA GAG ACT 2322 

G ACVCPPG KS GAKCKV GSO E 486 

GGG OCT TGC GTC TGT CCC CCA GGA CAC AGT GGT OCG CAC TGC AAA GTG GGC AGC CAG GAG 2382 

S FT I MPTS PVI HNSLGAV I O 506

TCC TTC ACC ATA ATG CCC ACC TCT CCT GTG ATC CAT AAC TCA CTG GGT GCC GTG ATT GGC 2442 

I A V L-G T L V VAL V A L P I G Y R H 526 

ATT OCA GTG CTG GGG ACC CTT GTG GTG GCC CTG OTA OCA CTG TTT ATT GGC TAC CGA CAC 2502

M Q K -nr KE HBHLAVAYSTGRLD 546 

TGG CAA AAG GGC AAG GAA CAT GAG CAC TTG GCA GTG GCT TAC AGC ACT GGG CGA CTG GAT 2562 

G S D Y V M P D V S P S Y S H Y Y S N P 566 

GGC TCC GAT TAC GTC ATG CCA GAT GTC TCT CCG AGC TAC AGT CAC TAC TAT TCC AAC CCT 2622 

SYHTIi SQ CSPKPPPPMKI PG 586

AGC TAC CAC ACA CTG TCT CAG TGT TCT CCT AAC CCT CCA CCC CCT AAC AAG ATT CCA GGC 2682 

SQLPVSSQASBRPKRNHGRD 606 

AGT CAG CTG TTT GTC AGC TCC CAG GCA TCT GAG COG CCA AAC AGA AAC GAT GGG CGA GAT 2742 

N H A T L P A D W K H R R E S H D R A F 626
AAC CAC GCC ACA CTQ CCC GCT GAC TOO AAQ CAC CGA CGQ GAG TCC CAT OAC AGA OCT TTC 2802 

L a H Q P P Q P K V * 637 

CTC AGC3 CAC CAO CCA CCT GOA CCO AAQ GTA TAG 2835 

CTGTAOCTATOGCCACAOOAATOGCCCOaGGCCATTCTGTCATAAAGGTCCCATCTCTGAAG 2914 

OTTATGTCCCTGAOCAGTQAQAACCCCTATOCOACCATCCGAGACCTGCCaSGCCTGCCTGQ^ 2993 

GCTATOTGGAOATGAAAQaCCCTCCATCAGTOTCTCCCCCCAGGCAQCCTCTTCATCTCCGG^ 3072 

ACTQCAGTCTCAGAGAGACAGCGaCACCTATGAGCAGCCCACTCCCTTOAGCCGTAATOAAOAGTCT^ 3151

CCCCCTCTTCCTCCGGGCcraCCACCCGGCCACTATGACTCGCCCAAAAACAGCCACATCCC^ 3230 

CTCCAGTACGGCATCCTCCATCACCTCGATCCCaGCaCCAGGACCGCrGAGGAGCCAaCATaGTATC^ 3309

TGAACCCTQCCAOGAGCAC^KKrCTGGACCAGCAGOCCATQAATAGACAT^ 3388 

GCTCTOCTTCCACCGAGQQAGACACTAQTTOOCAAAOTGTCTAACCTCCCT^ 3467 

GCTGTGGACATOAOCTGaTGGGCAOAATOTTGTTOTTaAAOTCTaATTTTAGATT^ 3546 

AAAAAAAAAAAGGGCGGCCGC 3567 

10 20 30 40 50 60

inputs C5TC-GACCavCOa3TCCacrCGJUlGa3GGGACCCTCaCCCC»TCCT<^ 

: : ! : i s :: 5 : t tx t . i . m i s . it it . i 

GTCCQACCCACGCGTCCO AOC- - CACaCCCIGAAOOTaOTTOOAAOa 

10 20 30 40 

70 80 90 100 110 120 130 

input 8 AGACCCCGOaXSTTCCTACCCCAGOCCGCAGGGGAQAaxn^CCCCAAOGC^ -TCCTQAA 

AGO QAACSGATCTAGGTCCTGAGCACTGG --AArTCCO^QAACAG-CATCrrGaCOT 

50 60 70 80 90 100

140 150 160 170 180 190 200

inputs CGCTXSQ-QATCCCGCA-GGACATTCCCTOGCCCCCAGGCCCCAGGTCXX^^ 

: : • s . M : : : . : t . s tti 1 1 . t t.t t *itx M « i t t t « 1 1 1 1 

CCtaiTOCTOOCCACaiCrroATQTaTCCTT CCG0--*-ClG----CTGGCTaa\QTacn\3TTCTGTT 

110 120 130 140 150 160 

210 220 230 240 250 260 270 

inputs GOCAOGCCCCACCTOGCGTCTGCAATOTCACWGa:^^ 

i * : t it : i X t s : . , it i < 1 • j : : s : t : : : t t t ! : t t : : : t i . s x t t : $ : 
GTTGGGTGCCCI^TGGCA- -GOCTTCTdCAATGC^ 

170 180 190 200 210 220 230 

280 290 300 310 320 330 340

inputs GGCTGGCTGGAAGTCra^CCCCAGTGATCXCAATACCTGCAGCT^ 

: : : i : : : J : 2 : t « i : t : : : : t s t s . : : : t 1 1 1 n t t i 1 1 % x x t i i x x t x ; : : : ; 

OTCroOCTGGtfACACraWVCTCCAATQATCXlCAAa^ 

240 250 260 270 280 290 300 

350 360 370 380 390 400 410 

input B CyU^GGAGTCCOVCTCCCXJCCCCTTCAGCCTOCTmiCTC^ 

::ti::i:::::::}t i i t x ttt ::s:sit ::t::::t::. 
TAAGGAGTaXACCrTCGCXX:crTCyvG<X^^ 

310 320 330 340 350 360 370 

420 430 440 450 460 470 

inputs CATAOTOC-CCCAQCCCACAAA- - -CT- -CAGA- - -OGAAACTCCTGQCT-TCTAGGGATTCATTCrrOC 
t : It iti I s ( } t : t t . . * : : t . : t x *x t t . i t t . : t , • . : . t : 

CACACCTOCGCTCAGCcrrACXKrrTaTcrrAcc^^ 

380 390 400 410 420 430 440 

480 490 500 510 520 530 540

Inputs 'A113^TCtG^TGtd3GG6CTQ-GAGT(^ 

* t tti t.t ti:: i,i t .ttttt t tt.:.tt tt: ..tt tt .t.t t x x 

CACXKXrra- - -a\OTGCrGTG<30GOTrACn:Aa3AGAGCAOTO^ TGC 

450 460 470 480 490 500 

550 560 570 580 590 600 610

inputs OClTTCTATGCGCCJCTCAGCCCy^GAOlXSTTGAGTO 

: . . . I t I : t • : . t 1 t : : * t t : : : : t s * . : : : . t : : 

CCAGa-AOTGTOTCCACGGTC- - - - - -GCrOTGTQ- -(KTrCCTAATCGOTGCCMIXyrQCACCAQaCTGO 

510 520 530 540 550 560
CGGGGTGAOGACraT TCC»OTG--AG-TGTGCr-CC-TGGAA"--tGTaOGGACCACAO TGT

570 580 590 600 610



690 700 710 720 730 740 750 

inputs GTCCCGCTCTGTGCCCAGOAOTGTGTCCATGOCCXJTTGTGTGGCACC^ -ATCAGTGCCAATG1X3TGCC 
— • :::::: : : . ::::!:,::::: : . : j 

GACAGOCTCTG CCTC- - -TGTGGCAACyioaVGTTCCTaTaATCCCAGaAGTGGGQTaTGTTTO 

620 630 640 650 660 670 680

760 770 780 790 800 810 

inputs AGaCTGGCGQGG<X5ACaACT0TTCCAGTGCCCCXyu^CTGCCTTC^^ -TGQCTACTATG 
t i t t t i . t * t t 1 : t t ; I : : 1 2 t t : s : : ; t : : : : : : t : : : t t : : : : : 

CCTCTGGC- - - CTOCAG— CC CXTCCXSA-CTGCCTTCAOCCTTO- -CCCCGATQGCCACTATG 

690 700 710 720 730 

820 830 840 850 860 870 880

inputs GCCXTOCXOTCCAGTTCCeCTGCCa«nX3^ 

1 1 1 I : t 1 1 1 t s s : 1 1 « i:tit :st i«>:(::t ati s: t:s:«:. : : t : : : s : : ; : t » t : 
OTCCTOCCTQCCAOTTTGATTOCCATTGCTATG^ 

740 750 760 770 780 790 800 

890 900 910 920 930 940 950

inputs CCCCQ.(^GAGAGWVCTCG^ 

titt : : : . t i : : : 2 . 2 t . : t 1 1 : tjjjt: i t i t t i i i t x t i t x t t 

CCCCXXrAOGQAGAACAGGACCCAG- -GGCACTGATGGCTTCTTCTGCCCC 
810 820 830 840 850 

960 970 980 990 1000 1010 1020 

inputs AOCACCCATCCTTOCCAAAATOGAQOTCnCTTCauUVCa^ 

: : : ; i s : i : : s : t s : s : : t : : m t s t t t • . a s , t i . s : 1 1 2 s i x 1 1 t i : i t s ! 1 1 tit: 
ACyACTTATCCTTGCCAAAATGGAGGTGTTCCT 

860 870 880 890 900 910 920

1030 1040 1050 1060 1070 1080 1090

inputs GGATGGGCACXATCTGCTOrCTGfCCCIWCCAGAGGGCT^ 

t : : 1 t M . : $ i 1 ! t 1 3 1 : $ t : i t : : t : : : : : : : : s t t 2 t t ; : s . t ::::::::::: 

GGATGOGTGTa^TCTGTTCCCTOCCATGCCa^GAGGGTT^^ 

930 940 950 960 970 980 990

1100 1110 1120 1130 1140 1150 1160

inputs CTGaav^AAOKKXKKICTCroTaACC^ 

ttisxtaa t; t:tt3 titstt ttttis:istt:s*s:: ;tt:: s: it t tttttt 

TTGaaCAATGOTGOCOTTTOTOACAaOTTO 

1000 1010 1020 1030 1040 1050 1060 

1170 1180 1190 1200 1210 1220 1230

inputs aK3TGCCQGaAQaA6TGCCXX3QTQGG(XlQC^^ 

ti:::::; :::::::: :::::t:;it2 22. :i :::::::::::> :: 2:2:: 2: t i : 

C»GT0<XXnWVQAGTGCCCTGTGGGCCGCrTCX3G^ 

1070 1080 1090 1100 1110 1120 1130 

1240 1250 1260 1270 1280 1290 1300

inputs AC^3CCCX5TTGC^TCX:03GCXaU^a3GaK^^ 

. : : : : : : 2 i 2 : : a : : : : : .: : s t 2 * : : : : : 2 t : : : * ; t t 2 3 t t : : : t . : 2 : 2 t 2 : : t i i 2 2 22 
GOKTrcaTTGCTTTCCTQCCAATGOOGaST^^ 

1140 1150 1160 1170 1180 1190 1200
1310 1320 1330 1340 1350 1360 1370

inputs tCGCCrCTCCCCCGACOOCTTWACGGTCTCAOCrGCCAaOCCCCC^^

: I : t J t i li t t tit : • • t : t : j : : t t i i : : . : : t : t s t t : t :: t t 
GCOACrcrOTCCAGATGOCOQCrATGGTCTGAOCrOCC^^ 

1210 1220 1230 1240 1250 1260 1270

1380 1390 1400 1410 1420 1430 1440 

inputs CrCAGCTGCCACCCGATGAACGGCXSAGTOCTCCTGCCnXJCCGGGC^^

::::::::: t t : t t : :::: t t : t * : : : t ::: t :::: t t :::: i : 

CrrCAGCTOCa^CCCTATGCACCWCGAOTGarCCTGCC^ 

1280 1290 1300 1310 1320 1330 1340 

1450 1460 1470 1480 1490 1500 1510

inputs GCTGCCCGCyiGGACyvCOCATQGGCCAGGOTGCCAGQAGa^CTGTCrrC^

stti::: ::t::i:tstt }t«. ; :i tts:s:::j:>i!t t i i t : :t}2:t:: :: t: i i : x , 
GCTGCCCTCAGGACavCXXy^CGaAQCaSQTTGCCAO^ 

1350 1360 1370 1380 1390 1400 1410 

1520 1530 1540 1550 1560 1570 1580

inputs GGCTACX!AGCGGClCTCTGTCAGTGCGCGCCCK3GTTAC^ 

: t . : 1 1 : t ! 3 1 1 1 : t t . 1 1 ; : s « t t s t s 1 1 1 1 M t t m i s t 1 1 : 1 1 s . i t : i : i t : : • t : t 
aSCaSACAOCGGCCTCTQCmn^TOCACCTGGCTAaVC^^ 

1420 1430 1440 1450 1460 1470 1480 

1590 1600 1610 1620 1630 1640 1650 

inputs GACACXrrACXSGTGTaU^CrGTTCTGCACaCTXK^ 

. : : : ; t : i : « t i : t : t t ; : : t : . t : : : : t : t :: t : t :::::: : t i . i : . ; t : t : t 

AACACrrTATGGGATCAACTGTTCCTCCCACrrGCTCCTO 

1490 1500 1510 1520 1530 1540 1550

1660 1670 1680 1690 1700 1710 1720 

inputs GCGAGTOOGTCTGCAAGGAACWTTGOCAGCGTGGTAAC^^ 

II. : : t s , 1 1 1 : 1 1 : : : 1 t : t s 1 : 1 1 t : ± s t : ; X « : ; 1 1 1 t t 1 1 « « 1 1 1 1 t : t a a : : : i : t : i 
GCAOQTQCATCTOCAAGGAAGGrrOGCAGCGTGGTAACTGC^^ 

1560 1570 1580 1590 1600 1610 1620 

1730 1740 1750 1760 1770 1780 1790 

inputs CTTO^GTTGOVATGCOVGCTGCCAGTGTGCCCATGAt^^

t : i t : I : : 1 : X : : t t : : t i i : : ; : t t t : : i : 2 : t t i : t : i : t : i :::::: t t s : i t 
CTTCAGTOKyUlTGCCAGTTGayiGTGTGC^^ 

1630 1640 1650 1660 1670 1680 1690 

1800 1810 1820 1830 1840 1850 1860 

inputs TGCACCCCTGGOTGGaVTGGQaCCCACTQCXavCKrr^^ 

ittttttsittts:::.::::: :: ;:tstit.t.tttsi::t*t:tt:stts: 

TGCACCCCTGCKm^GOOTGGGOlTCACTdCC^ 

1700 1710 1720 1730 1740 1750 1760 

1870 1880 1890 1900 1910 1920 1930

inputs cawjteKnxnxjACTOTGAoyvcTc^ 

: : : J : : t 1 1 t : t t t : t i 5 s i t i x t ::: t t ::::!:♦ J : i : . . : : 
CCAGTGTCn^TOACTQTOACXaVCTCCGATGa^ 

1770 1780 1790 1800 1810 1820 1830 

1940 1950 1960 1970 1980 1990 2000

inputs CrrGQATQQGTGCCCGCrGCXkCCTiJTCC^^ 

: : : J : s J : I . : : j :::::::::: t t i t s t xittt ,ittt 

CTGCSATGGGCACACGTTOCCACCTGCCTTaCCCAX^ 

1840 1850 1860 1870 1880 1890 1900 
2010 2020 2030 2040 2050 2060 2070

inputs ACCTQCAAGAATGCKSGGCACCTGTCrCCCTGAGAATGGOU^CTGC^

ACCTGCAAGMT0GTGGCACTTGTGTACCTGAGAAaX3CAAC^ 

1910 1920 1930 1940 1950 1960 1970

2080 2090 2100 2110 2120 2130 2140 

inputs CCTCCTGCCAGAGATCCTaTCAGCCrGGCCGCTATGGCAAACGCrGTXJTO 

: t 1 : t : : : : i : t : * tit: : : : t : : : : i ::::::::::::::::: t . : t : : : 
CCTCCTGCa^GAGGCCCrrGCCCXKlCrGGTCXSCrAT^^ 

1980 1990 2000 2010 2020 2030 2040 

2150 2160 2170 2180 2190 2200 2210

inputs CTCCTTCTQCXACCCCTCGAACGGGACCTGCrAC^ 

it t ::::ts::: sjj:::j;t: : f « : 1 1 : z : . 1 1 s M : s x : i : 1 1 : tztt;::i 

TTCTTCCroCCACCaSTCGGATGGGACCTGCTCCrrGCCTOG^ 

2050 2060 2070 2080 2090 2100 2110

2220 2230 2240 2250 2260 2270 2280 

inputs CCATQCCXn'CCAGaACACK3GGGAGAAAACriX3TOCCCAGACC^

t t J » It : 1 s } : : t : : : s : : : . : : j : : ; : : . : t : :: t : . : t : t : : t : : : t : x s : j : : : 
TCATGTCCCCCAGaCCACrGGGGACTCAAATacrrCCCaACCCT^ 

2120 2130 2140 2150 2160 2170 2180

2290 2300 2310 2320 2330 2340 2350 

inputs ATCCCXaVGOATGOGAGCTGTJVTCrW^^ 

: : s t 2 : 1 1 t t : : : i : : : : : . : 1 1 i : st t : t : : : i s : t 1 1 1 1 t tattt * : : t : i i z t < i i « 
ACCCaaiGaATGGQAacrGTGTCTGCATCX:a^GGClXm 

2190 2200 2210 2220 2230 2240 2250 

2360 2370 2380 2390 2400 2410 2420 

inputs GGGGACATTTGGTGCTAACKKrrCCCAQCa^TGCCAQTC 

. . t . : . J I : : : : J t t t : : : t : t t t t ::::: t : t «: x t ; 

AAGAATGTTlWTdTCy^ACTGCTCCCAGCTATGTCAGTO^ 

2260 2270 2280 2290 2300 2310 2320 

2430 2440 2450 2460 2470 2480 2490 

inputs GGGGCCTGTGTATGTCCCCCAOGGCACAQTGGTGCACCT^ 

Mitt t : t : 1 1 : t t : t : t t : . : 1 1 1 1 s s 1 1 1 1 . t 1 1 t : . . • t t : : ; $ : : t i : 1 1 1 i a * 
GGOGCTTGC0TCTGTCCX:CCAGQACAa«3TO^^ 

2330 2340 2350 2360 2370 2380 2390 

2500 2510 2520 2530 2540 2550 2560 

inputs TQMQe<!X3JiCCKCrca^^ 

t . : t : i t $ t : . t : X 2 . : : . * i t : t t x : . t : : t : x t t : x : ': x x t : : i x x x : : : : i j ; t t x t . x : t : 
TAATQCCCACCrn:n'CCTCTaATCCATAACTC^ 

2400 2410 2420 2430 2440 2450 2460 

2570 2580 2590 2600 2610 2620 2630

inputs lX3TQGTAO<XCTaQTOGCAaK3TTCATTaOCTATCGQ^ 

. : : 1 1 : t * t X : t t 1 1 : . X X I 1 1 : X : t t 1 9 1 1 t t 1 1 * s « : : t x x t t t : * 1 1 t t t s 1 1 . i : t i 1 1 t t 

2470 2480 2490 2500 2510 2520 2530 

2640 2650 2660 2670 2680 2690 2700 

inputs 0CrrGTQ0CTTA<yvGCAGa3GG<»CCrGGACGGCTCC^ 

: ! « X t t X : X : : X X ; : X x : : x t : : : : : ::::::: x it ::::::::::::::: : : x : x : x : : x x : 
GCy^OTaG<OTACAQCAClX3GQCGACTGGATGGCTCa3A^ 

2540 2550 2560 2570 2580 2590 2600 
2710 2720 2730 2740 2750 2760 2770

inputs GTCACTACTACTCCyu^arCCAGCTACCACACCCTGTCQCAOTCICTCCCXa^

: : : t : : : : i : :;:;!::: :::::: 1 1 i : t t : i : : : t ! t t : i ::}::::::}::;• 
GTCyvCTACTATTCCAACCCTAOCTACCACAOVCTGTCra 

2610 2620 2630 2640 2650 2660 2670 

2780 2790 2800 2810 2820 2830 2840 

inputs GGTTCCAGGC - - • CCGCTCTrTGCCAGCCTOCAGAACCCTGAQaKSCCAG^

'* ' i i i i i i i ' i : : : 5 : : } i i : x i j i * M t i « M : i : : . . . : * . 
OATTCCAGGCAGTCAQCTOTTTQTCAGCTCCCAOGCATCTQAGCOGCO^ 

2680 2690 2700 2710 2720 2730 2740 

2850 2860 2870 2880 2890 2900 2910

inputs AACCACACCACCCTGCCl^CTGACTGGAAG<yvCa3CCGGGAaCCCC^ 

t i s t i t ,t i t t : : t : : t : : : t m i : t : t i t : i t : ; ; : t j : j : j j ; : . : t t • . ... 
AACCACGCCACACnXKrCCGCTGACTGGAAGCACCGACXK^ - -TTTCCTCAOGC 

2750 2760 2770 2780 2790 2800 

2920 2930 2940 2950 2960 2970

inputs AOCAOCCXSCCTGGACCaAAO- CTACaVGCTATAGCTACAOCaw^TQGCXJaVQGCCCATTC^^ 

• : s s t . J j 1 1 1 1 1 1 1 1 J t : ; • s x s s s s * ; t tat 1 1 s 1 1 i 1 1 . 1 1 « i : : : : s • : i s 

ACCAOCCAOrPGGACCGAAGQtATAGCrQTAOTAT^^ 
2810 2820 2830 2840 2850 2860 2870 



2980 2990 3000 3010 3020 3030 3040 

i t t i t ; : t : s ; : : : : i : « ; . . t : :% t t t : : : : * ::; t t t t:: t t : : t t t : t t 

AAGGTCCCATCTTOAAGAAGGACTAGGGGCAAGCGTTATGTCCCTQAGC^ 
2880 2890 2900 2910 2920 2930 2940 

3050 3060 3070 3080 3090 3100 3110 

inputs CATCCXKSQACCTGOCCAOCTTOCCAGOGGOCXICCCXS^ 

: ; ; 1 1 X . t 1 1 1 ; 1 1 : i . : t :stt.:::i. t t : s s * j t . 1 1 s i s : s : * » t s : : : : i : : s : t $ : : i i : 
CATCCQAGUVCCTGCCCQQCCnWCTGGGQAAaXJCmOAAAG^ 
2950 2960 2970 2980 2990 3000 3010 

3120 3130 3140 3150 3160 3170 3180

inputs TCy^GGATClX3CCCCCAGGCAGCCfCCTCAGTTTTGGGACAQC^

i i t t . : : t t : : : : t t : ; : t : 1 1 ; mi : ::::::: : : t : . i t . : : j : : : : : : . i i j j 
TCAOTGTCTCCCCCXavGGCAGCCTOTCATCTCCGGGACAGGCAG- - -CAGCAGCAACTGCAQTCTCAGA 
3020 3030 3040 3050 3060 3070 3080

3190 3200 3210 3220 3230 3240 3250

inputs GAGACAGTGaCACCrACGAa<a^aCCCAQaXCCr^ 

GAGACAGCGGCACCTATGAGCAGCCCACTCanrraAaCCGTAATGAA 

3090 3100 3110 3120 3130 3140 3150 

3260 3270 3280 3290 3300 3310 3320

inputs TCTGCCTCCXSOGCCTAOXXCCGGCCACTAl^ 

ill : : : : s : : t 2 t ; . t i t : t i : : t : : t : : t s : t ; « : } i t i . t t t i : t t t : ? : : i i s « t t : : : : t t : : 
TCTTCCTCCXJGOCCTOCCACCCaGCCACTATGACTC^ 

3160 3170 3180 3190 3200 3210 3220 



3330 3340 3350 3360 3370 3380 3390

inputs TTOCCTCCAQTACQGCATCCCCCATCACCTCCACTTCGACGCCAGC^ 

:::: 2 t :::::: i :::: i : : :::::::::::: : : . : s : : : $ m t : : : : : i : : : t : t t 
TTOCXrrCCAGTA<X30CyiTCCrCCATCy^CCTCCATCCCX3GaK^ 

3230 3240 3250 3260 3270 3280 3290
3400 3410 3420 3430 3440 3450 3460

inputs OGCAQAOGCCAGCACACCTGGCnYSTTGCTQCrrCAAGGC^^

; : : I : ::. M i i t : j t 

00- - GAG ---- AGTGCCT - GTG AACCC - TQCCAOGA 

3300 3310 3320 

3470 3480 3490 3500 3510 3520 3530 

inputs GCAGGGAGTGQACCGGCAGGCTGTGAACATG AACAACGCTTAACAGAOCAAGT^
J : : } : ; i j : : j t : { 

GCAGGOCCTGGACCAGCAGGC CATOAA TAGACATA 

3330 3340 3350

3540 3550 3560 3570 3580 3590 3600

inputs TGGGTTCT'ACaVTGGGAGACGCTGATa^GCAGGATGCCTGGCrCCC^ 
t • t i s : 1 1 . 

CTTGO TGAA 

3360 

3610 3620 3630 3640 3650 3660 3670 

inputs CCTCCAGG0CCCTGTOTACATAAACn:GOTQ<k3TTGG^ 
t : i . : t . : . I : : : * i : , : t t 

— -^----OTOAACGGAGACTQ^-AGGATGG- - - 

3370 3380 

3680 3690 3700 3710 3720 3730 3740 

.inputs GTGGGQTACCTTTTCTGTQCATGCT<a.GCCTGGGCT 



CTCTQC-- 

3390 

3750 3760 3770 3780 3790 3800 3810

inputs GTACaVOGCAGOTTCTGTOCTAGGGCACrXACCATT^ 

-TTCCA CCQAQGO - AGACACTA 

3400 3410 

3820 3830 3840 3850 3860 3870 3880

inputs GOUlTAGCCrCCrAACTaGCCTCCTCCATTGATTCAGTGAACCTTCC;^



G ---- ----- TTGGC " 

3420 

3890 3900 3910 3920 3930 3940 3950

inputs ATACAGa<OTGTTAaTTACTCXXn'ACCTOAAAaCCTTCATA^ 

: : : : 

— --AAAG - - 



3960 3970 3980 3990 4000 4010 4020

inputs AAACrrTTGAAGGCCrTAAAGGCCCTGCT^^ 



-TGTCT- 
3430 



4030 4040 4050 4060 4070 4080 4090 

inputs GTTCCTOTCACTGCACXaca^OTCyvCACCXSGCCTCrr 

AACCTCC-- 
4100 4110 4120 4130 4140 4150 4160

inputs GCmCCrrGCACRCCTQQAGTGCCCrTCCTCCCCCAOTCQCC™ 

-CTTTTCC- - 

3440 

4170 4180 4190 4200 4210 4220 4230

inputs CTCCrCACXSOAAGTOCCCACCCrCCGTACATCnrTaVCAOCCCTGATTG^^

: M t : J : : t : , : : . : 

. — -.--------.,-.---AGCCC--ATTGCT-*----- ....CAAG 

3450

4240 4250 4260 4270 4280 4290 4300 

inputs TACCTGCAGAAGQCCrACAQQGTOCCAGQCACrrc m AATGQGtTCTTT

: • ■ • 

T * 

3460 

4310 4320 4330 4340 4350 4360 4370

inputs AATCrcrrGCCTCCC<XyVCTAaACrGTAAGCTCCCrGAAaQC^ 

iitltS 

- CCCCCA - - 



4380 4390 4400 4410 4420 4430 4440 

inputs CTCTCCCTTGGaVCAGAGTAGGCACPOAaUUVTGCTCCCC^^ 



GGCTGTG-- 

3470 

4450 4460 4470 4480 4490 4500 4510

inputs ACXaVGTGACATOCAGTAACTGCrAAaATAGATaAacCATCT^ 
s 3 : t t s 

QACATG - 



4520 4530 4540 4550 4560 4570 4580 

inputs GTTGGA0ACTTCCCTAAAGaGTGGCATTTCCCCAGGaTAACAAO3a^ 



4590 4600 4610 4620 4630 4640 4650 

inputs AGa(ldeAaGGGTGCU^GGGOCrGAG<KrrGAGOGaGGTGCAaA^^ 

* s 1 1 1 : • t . 

~ .AG<7IX3GTGG-^ T ^ 

3480 

4660 4670 4680 4690 4700 4710 4720

inputs TATACACWClATGCCTTaATTTATTGa^CTTCACAGQTAGCAGAAT^^ 

t i I : t : : t i 

- aavaAATGTT----GTTaTTaAAG 

3490 3500 

4730 4740 4750 4760 4770 4780 4790 

inputs ACATATATGTQACAOaATAGQTTAAaAAAAGCAAAaaiaACaAAATTGAA^ 
4800 4810 4820 4830 4840 4850 4860

inputs TAAOCAAATCTGTTGOCACCATTTTTCCAATAQCATGTGCCCATT^^ 



------- - -TCTO- - -- -- -'.-•--*-(------*----»- - • - ATTTTAGAT* 

3510 3520

4870 4880 4890 4900 4910 4920 4930 

inputs AATTOCTTGCAATATTTCAAOa^TTTTCATTG^ 



4940 4950 4960 4970 4980 4990 5000

inputs TTOATATATTATTGTAATTaTTTOSGGaCGCCATQAACCGCACCC^^
: : t : * : « : : : 4 * . . i : . 

-TGATTTTTTAAAAAAAA-- ■ — 

3530 



5010 5020 5030 

inputs AAAAAAAAAAAAAAAAAAAGQGCGOCCG- 
t : : 1 1 s 1 1 1 1 : M t M : t i t : s i } t 1 1 1 
AAAAAAAAAAAAAAAAAAAGGGCGOCCQC 
3540 3550 3560
10 20 30 40 50

inputs GTC - GACCCACGCGTCCGG TGACCCTGTTCATOGACAGT GCCGATGTCAGQ CTQGT

: : : ::::::: t :::::: , . ; : : s : ; . . ♦ : : : . . : . : . , « : j . . ; tit, 

GTCCGACCCACGCGTCCQAGCCACACCCTGAAGGTGQTTGGAAOOAGGGAAGGATCTAGGTCCTGAGCAC 
10 20 30 40 50 60 70 

60 70 80 90 100 110 

inputs TGGATGGGCACA- CGCTGCCAC- - -CTGCCTTG-CCCGGA- -GO- -GCrTTTGGGGAG-CCAAC-TGCAQ 
i j: .1 a -J it:» * .j . t:. :.: ti,i 

TGGAATTCCCCTVGAAa^GCSVTCTGGCTTCCCAGACCCATGCT^ 

80 90 100 110 120 130 140

120 130 140 150 160 170

inputs -TAACACCTGTACC-TGCAAGAATGGTGGTACCTGTG- -TGTCT-GAGAATGGCAACTGCGTGTQCGCAC 
:.«:.:.::.: : : . . . . : $ : : : t i ; : . : t t s . : : . t $ i z t : i t t x t . : 
CXXSKSCTGCAGTGCTQTTCrGTTG 

150 160 170 180 190 200



180 190 200 210 220 230

inputs CAO GGTTCCGAGGCCC-CTCCTGCa\GAGG<XCroCCCGCC--^ -OCT 

: « : : : t : t i t s : : t t • t . . • : i s : t x : s . t s : • : i : : • • t : : s 

CTCCrrCCl^3KK!CCTAOGCfCrGCX3TC^ 
210 220 230 240 250 260 270 

240 250 260 270 280 
inputs OTOT- -GCAATGC- AAOTOT- - -AACAACAA<X»TTCTTCCrrGCCACCCATC^ 



GGQAAAOCTTCyvCCACOACCACrAAOGAaTCCCACCTIWCCCCl^^ 
280 290 300 310 320 330 340 

290 300 310 320 330 

inputs -OACGGGACXrrG- - - - - - -CTCCT-aCCTO- - -GCGGGCIX3-OAC3WIOC- -CCTOACTGC- -TCCQ- -AG

: : : s i : : . • t t t : i * t t it t < s : i . : 1 1 t t 

CGACAQGCCerGGOAAGACGCCCACUVCCTOCGCTaVGCCT^ 
350 360 370 380 390 400 410

340 350 360 370 

inputs GC ATQ---TCCC--CCAGGCCA CTGGGO ACT-CAAATGCT------CG 

• tit : : : : : s : s : t • t : : s $ tit;.:..::. : : 

QTGGTQAAOATGGACrCCCOCCCACGCCroCAGTGCTGTGGGGOTTACT^ 
420 430 440 450 460 470 480 

380 390 400 410

inputs - -CAACTCTG- - -CCAQ -TGTCATCA TQ-GTGGGACCT GCCA CCCC- - - 

:::: ::::.:« :: ::::.$:: ^ * * ^ . i * ^ 

TCCCACTCrGTGCCaVGGAGTGTGTC'Ca^CXKn'CGCr^ 
490 500 510 520 530 540 550
420 430 440 450 460
inputs CAGQATGGGAG CTGTATC TGCACGCCAGGCTGQACTGGACC - CAA CTGC 



CTGGCGGaQTGACGACTGTTCCAGTGAGTGTGCTCCTGGAATGTGGGGACCACAGTGTGACAQGCTCTGC 
560 570 580 590 600 610 620

470 480 490 500

inputs TTGGAAGOCTGC CCA CCAAGAATGTTTGOTGT - - -CAACTGCTCC 

t . > s : : . . J : ; . J i t : j t : 

CTCTGTGGCAACAGCAGTTCCTGTGATCCCAGGAGTGGGGTGTOTTTTTaCCCCTCTGGCCTO 
630 640 650 660 670 680 690

510 520 530 
inputs -AGCTATGTC- -AQTQ -TOATCT- CGGAaAGATQ TGC 



CCQACTOCCTTCAQCCTTGCCCCQATGQCCACTATGOTCCTOCCTOCCAGTTTQATTGCC^ 
700 710 720 730 740 750 760 

540 550 560 570 580 

inputs - -OVCCCAGAGAC— - - • - - -TGGaaCTTOTGTCTOTCCCCCAGG — -ACACAG- - - - -TGGTO 

;:;:.{.:!: t : $ t : s : : : : t : t > s : 3 : : . : : 

GGCATCCTGTGACCCCCGGGATGGAGCCTQCTTCTaCCCCCttGGGAGAACAGGACCCAaGaa 
770 780 790 800 810 820 830 

590 600 610 620 

inputs CAGAC - TGCAAAATOGGAAG CC--AGGAGTC-CTT- *CACCATAA- 

GCTTCTTCTGCCCCAGAACTTATCCTTGCCAAAATGGAGGTGrrCCT 
840 850 860 870 880 890 900 

630 640 650 
inputs -TGCCCACC- TOT CCCG TGACCCATAA CTC ^ - - -ACTGG 

CTGCCCACCGGGCTGGATGGGTGTCATCTGTTCCCTGCCaVTQCCCAGAGQGT^ 
910 920 930 940 950 960 970 

660 670 680 690 700 710 

inputs GTGCAGTaATTGGCATTGCAGTACTGGQAACCCTCGTG- - - -GTOaCCCTGATAG- - -CACTGTTCAT-T 
: : : : : . ; t t • j : t t . 1 : : : , . : : : : : : s : : i * : s : : : : : 1 t . : 

ACTCAG-GAATGTCGTTGCCACAATGaTGaCCTTTGTC 
980 990 1000 1010 1020 1030 1040 

720 730 740 

inputs GGCTA CCG -CCAOTGG CAAAA- -GGGCAAOGAACA 

1 J t 1 1 s t J ::.!:!: : : : . s j : : . , . J j t ♦ 

OOCTATATCGGGGATCGGTGCCGTGAAGAQTQCCCrrGTGQGCCaCTTC^^ 
1050 1060 1070 1080 1090 1100 1110 

750 760 770 780 790 

inputs TGAaCACTTGaCA---GTGGCTTAC AGCACTGGQCXSa- -CTGG-ATGGCTCTGATTA 

: : . : 1 . ; ; s : : . i : : 1 j : . : ♦ j : « : : 1 : ' • • : 1 : 1 J t . . • 

GTGACTGTQCTCCTGGCGCTCraTTGCTTTCCTGCCyO^ 
1120 1130 1140 1150 1160 1170 1180 
800 810 820 830 840 850

inputs COTCA- -TOC-CAGAT-GTCTCT- -CCQA ^^^-GCTATAGTCACTACTACT - -CCAACCCCAGC 

CGACCGCTGCACTGAGCWACTCTGTCCAQATGQCCGCTATGQTCTGAGC^ 
1190 1200 1210 1220 1230 1240 1250

860 870 880 890 900 
inputs TACC- -ACACACTGTCTCAGTGTTCTCCTAACCCCCCGC CCCCTAACA AGGTCC- -CAQGCA 

GACCCAGAACACAGTCTCAGCroCCACCCAATGCACGGCGAaT^ 
1260 1270 1280 1290 1300 1310 1320 

910 920 930 940 950 

inputs G- -TCAGCr-CTTTGTCAGCTCTCAGGCC-C- - -CTGAGC GGCCA- -AGCAGAGCC CA 

. 2 . t I : t :::::: t : : : ; : x i t t : : : t « : : : : t 

TCCACTGCAACGAOAGCTGCCCTCAGGACACOCACGGAGCCGGTTOCC^ 
1330 1340 1350 1360 1370 1380 1390 

960 970 980 990 1000 1010

inputs GOGG<MTQAGAACCaVTACCACACTOC-^CCGCrOACTGGAM CGGGAGCCG C 

: : t i t t , . * : : . . : s . it.:: t : at i t . . 1 1 1 : t tt s t 1 1 s 3 t : 
CXK3CGGTGTTTGCCTaK:CG-Aa^aCGGCerCTQCCX^ 

1400 1410 1420 1430 1440 1450 1460 

1020 1030 1040 1050 1060

inputs ATGACAGAGGC-GCCAQCCAC - -CTOGACCQAA-GCTATAQCTGTA GCTATAQCC 

.•••:«. 1 t t t ;:: .t; :: :s.::: 

TAATCTTTGTCCACCTAACACTTATGGGATC^ 

1470 1480 1490 1500 1510 1520 1530

1070 1080 1090 1100 1110 

inputs A CAGG-AATGGCCCAQG- -AC- -CATT -CTQTCATAAAGGTCCCATCTCTGAA OA* 

; . : . : : t : : . : . : : t . : t s ) . . s t : : . : : t i : . . s . 

TGCTCTCCTGTCGACGGCAaSTGCATCTGCTAOGAAGGTT^^ 

1540 1550 1560 1570 1580 1590 1600 

1120 1130 1140 1150 1160

inputs GGGACrAGGGGCyVAGOSTTA-TGTTCCTGA-GCAGTGAGAACCC-CT^ TGCTACC- - - 

X t : s : : : . . . : : J . : . it . j . : . . s ; : : : j t : . « s : 

CCCCTGGCACCTXSOGGCTTCAGTTGO^TGCC^^ 

1610 1620 1630 1640 1650 1660 1670 

1170 1180 1190 1200 1210 

inputs - ATCCGAGACCTG CCCAGCCTGCC - TGGGGAAC CC- CGAG - - - AAAGTGGCT

J . : J . . - : : : : : j : . : : t : J i : j . : • : : s s ♦ : . 

AAACTGGAGCCTGTACTTGCACCCCTGGGTOGCGTGGGGTO 
1680 1690 1700 1710 1720 1730 1740

1220 1230 1240 1250 1260 

inputs ATGTGGAGATGAAAQGACC- • - -TCCAT- -CAGTGTCCCCTCCCA-GGCAGT CTCTTCAT C 

. : : s t . 1 : , s . . . : • J : : s ♦ J J : : : i : : t ; : . : : : t : J : J s : 

GTTTQGTGAAGGTTGTQCaVGTGTCIGTGACriQTGACCACT^ 
1750 1760 1770 1780 1790 1800 1810 
1270 1280 1290 1300 1310

inputs T-CCGG-GACAGaCAG-CAQ COG- - -CAACTGC- - AGCCACAGAGGG- - ACAQCGGCACC 

! s * i • 5 • ' s ' * s * s 5 • ! : : : t : : : ! . i : t t i : : : : : 

TGGCqATGTCAGGCTOGCTGGATGGGCACACGT^^ - CAGAGGQCTTTTGGGGAaCC 

1820 1830 1840 1850 1860 1870 1880 

1320 1330 1340 1350 

inputs TA-TG-AGCA--GCC CAGC CCCTTQAG- -CCATAATGAAGAGTCTTTGGa 

*i a ttt i Hi : . : t : : « : . j : : j ; 

AA(n<3CAGCAATQCCTOTACCrn3CAAGAATGGTGQCACTTO 

1890 1900 1910 1920 1930 1940 1950 

1360 1370 1380 1390 1400 

inputs CTCCA C----GCCCCCGCTTCCTCCA0QCXrroCC-TC(nX3aT^ 

' ' ' i i i 1 J : ; : : : ! i . ♦ t : * i J : i 1 1 1 t . : : : 

CAca^GGaTTa^GAGGccccTccrGccAGAaGCCCTGCca3c:^^ 

1960 1970 1980 1990 2000 2010 2020

1410 1420 1430 1440 1450 

inputs C- *CaWVG- - -AACAGCCATA-TCCCTO- * — - -GAC- ACTATQACTTQCCT* -C- — CAQTAC- 

i «8S! s :.:::!. 1 1 1 M t.s , ,t xitt: ; :.t :t 

CTGC»AGTGaVA<»ACCATTCTTCCTOC»CCCQTC^ 

2030 2040 2050 2060 2070 2080 2090 

1460 1470 1480

inputs GGC---ATC — CTC --CAT CCCCT--CCA TCCCGGC GCCAO-GAC 

: : : . . : : : : : ; t t t ; : : i . i 

GGCCCnxSACTOCTCTGAATCATGTCCCCCAGGCCACTGGGGACrCAAATO 

2100 2110 2120 2130 2140 2150 2160

1490 1500 1510 1520 1530 1540

inputs CGC-TGAAGA-GCCGGCAT aaTATGGGAGC-GTGCCTATGTACCTTGC- - - -CAGGA G 

J : : * ♦ : . t J « « . * J : J : : t t : : : J : t j . . ; : : . : : : . s t t j 

ATCATQGTGCCACCTOCaVCCCCCAGGATGGGAGCrrGTGTCrGCA 

2170 2180 2190 2200 2210 2220 2230

1550 1560 1570 1580
inputs CAGGGACTG - -GACCAGCAGG CCACG - AACAGAAACA CTTGGTGAA 

CTCGGAAGGCIOeeCATXia^GAATGnTOGTGTC^ 

2240 2250 2260 2270 2280 2290 2300 

1590 1600 1610 1620 1630 

inputs GTGAAC AGAGACGGACTGTGGC-CCTGTQCTTC^ - - -CACCGAGGGAGACACT^ - - -AGTTGACA 

: s : : : : : « : : : : : : : : : t : t : . $ s * . : : : : : i : i , : : 

ATaTQCCACCa^GAGACTGGGGCrTGCWTCrGTCCCCCAGGAC^ 

2310 2320 2330 2340 2350 2360 2370

1640 1650 1660 1670 1680 1690 

inputs ---AAGTGTCTAAC-CCTCTTTTCCAACC-CAC---TGCTC----AAOTCCCrro ATAAGC-- 

: « i * ] i I • . : : : . , . : : ; : : : : « : : : : : it t : : : : : . : : : . . t s 
GCO^QGAGTCCrrTCACaVTAATGCCCACCTCTCCTGTGATCCaiTAA 

2380 2390 2400 2410 2420 2430 2440 
1700 1710 1720 1730 1740

inputs TOGTGGGCAGAA TGTTGTTGTACAAOTG TGATTTTAG - - - ATCGATTTTTTTTTAAAGT -



TGCAGTGCTGGGGACCCTTGTGGTGGCCCTOGTAGCACTGTT^ 

2450 2460 2470 2480 2490 2500 2510

1750 1760 1770 1780 1790 1800 1810

inputs ATGTQTTGGGTAC - CTTTTCTGTG - - TGTATGCTCAQGCAOGCTOTGTOTOTCrrcrAGTTQGCTTTAaAG 
i » i * * . : * * t ; : : : . : t j : . j . j : : . t s tt : ; j , j , • . 

AAGGAAGATGAGCACrrrOGCAGTGGCTTACAGCACTGGGC^ 

2520 2530 2540 2550 2560 2570 2580

1820 1830 1840 1850 1860 1870

inputs OGAGTC- - - - - - AGGTATAGGTT<n«CCTT- -CTQCACT- - -TTCCA-TCT-TATCT- AGTAOTGAGCW 

i » > i t t it ti t t .: :*»s ;] :«::: .t. :t, ::. 

GATGTCTCTCOTAQCrrACAGTCACrACTATTCCAACCCTAGCrACCA 

2590 2600 2610 2620 2630 2640 2650 

1880 1890 1900 1910 1920
inputs -CCAAQCTTAACrAGTTAGAGCTCCA — C CAGCAO* « - - -Ca^G-OCCCTAACTAC CTGCCTGC 



ACCCrCCACCCCCTAAaVAGATTCCAGGCAGTCAGCTGTTTGTa^ 

2660 2670 2680 2690 2700 2710 2720 

1930 1940 1950 1960 1970
inputs CCTTCACC C-AGTAATCCTC-CATGTCTTTGCTOVGA-GGATTGCTCC-CCGA CTCT 

CAGAAACCATGGGCXSAGATAACOVCGCCACACTGCCCGCTGACTGGAA^ 

2730 2740 2750 2760 2770 2780 2790 

1980 1990 2000 2010 2020 
inputs GGTGTTGTCCTCCTQ- - - -GTACGCCTTGAC GGTCCTGCAGT^ -CT CC-C CTTCCCG 

AGAGCTTTCCTa^GGCACCAGCaCCTOGACCGAAGGTATAOC^ 

2800 2810 2820 2830 2840 2850 2860 

2030 2040 2050 2060 2070 2080 

inputs T CTTGCT - TCATT -CTTTCCCAGAATQAAQGCrrGTCrrGCCACCCTACr-TCCCAQCC^^ 

t . s s : : : : : . t * n ; * : « : it i : : . : j s m . J t : : . 

GGGCCSlTTCrOTCATAAAGGTCCaVTCTCTGAAGA^ 

2870 2880 2890 2900 2910 2920 2930 

2090 2100 2110 2120 2130 2140 

inputs A ttXJGCA- -CATCTAAGl?TdAGCC TTCerAAGTTACCCGTTGAaTCCracITOCCC^ 

: : . t : . t s : : . t t . t • s : t s s : : * . t . 1 1 t t 

AGAACCCCTATGCGACCATCCGAGACCTGCCCGGCCrGCCrGGGGAACCC03A 

2940 2950 2960 2970 2980 2990 3000 

2150 2160 2170 2180 2190 2200 

inputs CACATAT TCCA-CAQAA-CACCC3VCC CCavCMCTQCOTdlTAGCTACTCrcrTCTCC^^ 

t . . i • sat I : : • . : . : : s : : : t . : « x : • a : * * « t . . : • : • : i 

GATOAAAGQCCCrCCaiTCAGTQTCTCCCCCCAaacy^GGCrCl^ 

3010 3020 3030 3040 3050 3060 3070
2210 2220 2230 2240 2250 2260

inputs GTACCCACAQAAGGCAOAAGTGGTACCAGGCAAGAAGATGGGA- - -TTQTTGCATTTTOTTTTQTTTTTG 

CTGCAGTCTCAGAGAGACAGCGGCACCTAT-GAGCyVGCCCACTCCCTTGAGCCGTAATOAAOAQTCTO 
3080 3090 3100 3110 3120 3130 3140 

2270 2280 2290 2300 2310 2320 2330 

inputs AGACTCTGT-CTCACTATGTAGTCCTOGCTGOCCTa- -GAACTCAAGAQCTCTQCCTGCCTCTGCCTCTT 
. : m * : t I J X : t i * s tt i m : . : . : . • . : . . : : : t : . . . : . j : : . : . : 
GG-CrCCATOCCCCCTCT-TCCTCCGGGCCTQCCACCCGGCa^CTATGACTCGCCa^ 

3150 3160 3170 3180 3190 3200 3210 

2340 2350 2360 2370 2380

inputs QAGTGCTQGGTTTA- - - - ACGGCT- -CAQGQTCACATOCA- - -CAGCTCAAOCTOCACT- - 

s . . • : i • : . : * ' : i s i : « : . . s : : s i ) : t • : t t 1 1 . : . : : 

CCOTSQAttCTATGACrraCCTCCAGTAaM^ 

3220 3230 3240 3250 3260 3270 3280 

2390 2400 2410 2420 

inputs CCGA TGTGCTT TCCC CTGTTGCrAGATTAGCaTCTGCCTCCC

M . * . : : J : t . : : : t . : , : i . . : « . ; . t . : s : : 

GGAGCCAGCATGGTATGGGAGAGTGCCTGTGAACCCTGCCAGGAG^ 

3290 3300 3310 3320 3330 3340 3350 

2430 2440 2450 2460 2470 

inputs CCTAGTGGAG - -AOGCTGA TCGC-CAaCT- -CTCTGATGCAGGACTCTGGT- - 

: s : : : . : j : : . : : : : 

ATAGACATACrTGGTGAAGTGAACGGAGACTGAGGATQGCrCTGCTC 

3360 3370 3380 3390 3400 3410 

2480 2490 2500 2510 

inputs GTTTAGGCTCA CTCy^CTATTGaTTTCCrrGGCACAGG - - -GTAGTCA CT 

: . . : : : : . : : : : : . : : « it**. t m 

GCavAAGTGTCTAACCrCCCTTTTCCAGCCCy^TTGCTCAA 
3420 3430 3440 3450 3460 3470 3480 

2520 2530 2540 2550 2560

inputs CAA----TAAATaTtCC— TCT AAAAGCTGAAAAAAAAAAAAAAAAAGG 

CAGAATQTT G TTGTTOAAGTCTGATTTTAGATTOAT^^ 
3490 3500 3510 3520 3530 3540 3550 



inputs GCGGCCGC 
t 1 $ M t : : 
GCGGCCGC 
3560 
10 20 30 40 50 60 70

inputs maparagfcplllllllglwaeipvsakpkgmtssqwfkiqhmqpspqacnsamkninkhtkrck^ 
^ lcfp£llll£vlwgpvcplhawpk^ 

10 20 30 40 50 60 

80 90 100 110 120 130 140 

inputs FLHEPFSSVAATCQTPKIACKNGDKNCHQSHGPVSLTMCKLTSGKYPNCRYKEKRQimSYWACKPPQKK 

FLHDSFQWAAVCDLLsivCKN^^ 

70 80 90 100 110 120 130 

150

inputs DSQQFHLVPVHLDRVL 

DPP-YKLVPVHLDSIL 
140 150 



Figure 36 
10 20 30 40 50 60 70

inputs GTCGACCCACGCGTCCGGCTCCCAGCCCACCCCCAAACAGACACAGCGTAGCCCGGGCCAGCTCTTAAGG 

AT : GG 

80 90 100 110 120 130 140

inputs AGTTCAGGAGTGAGAAGAGGCCCTCy^GAGATCTGACAGCCTAGGAGTGCGT^ 
^ : ; : i : : : : 

<X«6*— — — — — ' — — -.-CTA- TGCTT TCCTCTTCT 

10 20 

150 160 170 180 190 200 210 

inputs TGAGCAGGAGTCACyvGCACGAAGACOAGCGCAAAGCGACCCCTGCCCTCOlTCCT^ 

; ; : : is.: ♦ s : 

TTTACIG CTGC ^ TGGTT CTA 

30 40

220 230 240 250 260 270 280 

inputs AGAGAGAaX3GCACCGGCCAGAGCAGGATTCTK«:cCCCTTCTGC 

. : . J : • : i : : . : : : : j : : : : . : : : : : : i : 

TCGG'- - - -^CCAGTG-- TGTCCACCTCA- -TGCTT- - - -<5GC-* — 

50 60 70

290 300 310 320 330 340 350 

inputs CAGAGATCCCAGTCAGTGCCAAGCCCAAGGGCATGACCTCATCACAGTGGTTTAAA^ 
s , . : : : : s : * a ttti 

CTAAG C-GTCT CA CCAAGG-C TCAC--TGGTTTGAAATTCAGCATATACA 

80 90 100 110 

360 370 380 390 400 410 420 

inputs GCCCAGCCCTCAAGCATGOiACTCAGCCATGAAAAACATTAACAAGCA 

J : : : : : J : J . : : s : . : : s j . i s 

GCCAAGTCCTCT CCA- ATGCA — ACAGGGCAATGA 

120 130 140 150 

430 440 450 460 470 480 490 

inputs AACACCTTCCTGCAGGAGCCTTTCTCCAGTQTGGCCGCCACCTGCCAGACCCCCAAAATAGCCTG^ 

: : $ : : t 1 1 

GTGGCATCAAC AATTATGCC 

160 170 

500 510 520 530 540 550 560 

inputs ATGGCGATAAAAACTGCCACCAGAGCCACGGGCCCGTGTCCCTGACCATGTGTAAGCTCACCTCAGG^ 

: : ; at : s : : : : : . : : : : 

^ .-^-CAG CAC TGTAAGCA TCA A 

180 

570 580 590 600 610 620 630

inputs GTATCCGAACTGCAGGTACAAAGAGAAGCGACAGAACAAGTCTTACGTAGT<3a:CTGTAAGCCTC 

.... t J 5 ! 5 ! i J : : . : m 

AATACCTTTCTGCATG-AC TCTTTC CAG 

190 200 210 

640 650 660 670 680 690 700

inputs AAAAAGQACTCTCAGCAATTCCACCTGGTTCCTGTACACTTGGACAGAGTCCTTTAGGTTTC^ 

Figure 37A 
AATGTGG- CTGCTGT -CTGT * -GATTTGCT— CAG-

220 230 240

710 720 730 740 750 760 770 
inputs CTTGCTCTTTGGCTGACCTTCAATTCCCTCTCCAGGACTCCGCAC 

^CAOTGTCTC--CAA-AAATC ^GTCG-^<XACAACTCICCA CCAGAGC— - — 

250 260 270 280 

780 790 800 810 820 830 840 
inputs CITCCCCTCATCTCTTGGGGCTHSTTCCTGGTTCAGCCTCTGCTG^^ 



TCAAAG CCTG-*TCAACAT-GACT— — GACTG CAGACTCACT 

290 300 310 320 

850 860 870 880 890 900 910 

inputs GCTGAGCTCTAGAGGGATGGCTTTTCATCTTTTTO 

TCAGGAAAG TATCCCCAG-^* - 

330 

920 930 940 950 960 970 980 

inputs GCAAGCTCAGGTCTGTGGGTTCCCTOGTCTATGCCATT<^ 

— TGCC GCTATAGTG 

340 350 



990 1000 1010 1020 1030 1040 1050 

inputs CAGCATGACAAGGAGAGGAAATAAATGGAAAGGGGGO^TATGGGATTTGTGGA 

CTGCT GC = C 

360 

1060 1070 1080 1090 1100 1110 1120

inputs CTGAACTAGAAGTCTTCCCa^GCTeK3ACGTGGCAGlX3AGGT^ 

;.:•:!.:•- X s s J s * j : . . 

CAGTACAAAT— TCTTC * *• ^ ATTG 

370 

1130 1140 1150 1160 1170 1180 1190 

inputs ATACCACTTCATATTTGTATAGAATCCTCTAATCCCTTGTGACATAGACl^ 

. ! , . : : : J J : J : 

ij-TGCCT GTGACC -CCC CT CAG 

380 390

1200 1210 1220 1230 1240 1250 1260 

inputs CTTTATGGATGASGAAATTAAGGTmAQAA^ 

: t : t : t 

. AAGAGC 

400 

1270 1280 1290 1300 1310 1320 1330

inputs CAQAACCTGGACTTGAACCTAGGTCTCCTTGCTCTAAATACAGTGTACCTTC 

, : : ; : ? a t : ! s : : 1 1 : 

GACC -,:.^CC CC— CTACAAGTTG 

410 420 

1340 1350 1360 1370 1380 1390 1400 

inputs AGAAAGAAGIKTACTCTTACAGAGGCAAGCGGTGAACTAGGTAAGAGTTCAC 

GTTC -CTGT - ACA * s-CTTAGATAGTATTCTCT- 

430 440 450 

1410 1420 1430 1440 1450 1460 1470 

Figure 37B 
inputs CTGAAGAGCCAGfTACCCTGTGTTGGCTGCAATAAAGGTCATTACCTCTCTAG^^



1480 1490
inputs AAAAAAAAAAAAAAAAAAAAAAAAAAA 

: ; 

. AA 
240 250 260 270 280 290 300



, : . : : : : : at : : :•:::::::!:: . : : : : . : t : * * : 

GGTGCTATGCTTTCCTCTTCTTTTACTGCTOCT^ 

10 20 30 40 50 60 70 

310 320 330 340 350 360 370

CCCAAGGGCATGACCTCATCACAGTGGTTOAAAATT^ 

• • : « : : : : : • . : • : : i t • : : t : : : t : : : : • : s t : s : : . 

CCTAAGCGTCTCACCAAGGCTCACTGGTTTGAAATTC 

80 90 100 110 120 130 140 

380 390 400 410 420 430 440 

CAGCCATGAAAAACATTAACAAGCACACAAAACGGTGCTJUVGACCa^ 

150 160 170 180 190 200 210

450 460 470 480 490 500 510 

CTCCAGTGTGXK:CGCa^CCTGCCAGACCCCCAAAATAGCCTGCAAC^ 

:: . : . : tt.i ::::::.:ts t ; :: 

CO^GAATGTCGCTGCTGTCTGTGATTTGCTOIGCAT^ 

220 230 240 250 260 270 280 

520 530 540 550 560 570 580 

GAGCCACGGOCCCGTGTCCCTOACCATGTGTAAGCTCACCTCAGGGAAGT^ 

{ . J J . . 1 : : : : ♦ : mt . . a : . . ! : t : i : : : : : : : J • . 

GAGCtCAAAGCCTGTCAACATGACTGACTGCAGACTCAGTTCAGGAAAGTATC 

290 300 310 320 330 340 350 

590 600 610 620 630 640 650 

G-AGAAGCGAaiGAACAAGTCTTACGTAGTGGCCTGTAAGCCTCCCCAGAAAAAG<^ 
. ^: ::j.::!s.! :::::: .s :: :::::.!. ::j : J : :-! 

GCTGCTGC -CCAGTACAAATTCTTCAiriX5TTGCCl^^ 

360 370 380 390 400 410

660 670 680 

CACCTGGTTCCTGTACACTTCSQAa^GAGarCCa-^ 
: M j : :: ::. 

. AAGTTGGarrCCaOTACACTTAGATAGTAT^ ' 
420 430 440 450 



43.4% identity in 477 aa overlap; score: 746

410 420 430 440 450 460 

GGTGCAAAG ACX:TCAACACCTTC"-CTGCACGAGCCTTTC--TCCAGTGTGGCC^ 

GGTGCTA-i^ 

10 20 30 40 50 60 70 
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470 480 490 500 510 520 530 

CC-------CCCAAMTAGCCTGCAAGAATGGCGATAAA-AACTK3CCACCAGAGCCACGGGCCCGTGTCC 

CCTAAGCGTCTCACCAAGGCTCACTCGT^ 

80 90 100 110 120 130 140 

540 550 560 570 580 590 600 

CTGACCATGTGTAAGCTCACCTCAGGGAAGTATCCGAACTGCAGGTACAAAGAGAAGCGACAGAACAAGT 



CAGGGCAATGAGTGGCA-TCAACAATTATG CCCAGCACTGTAAGCATCAAAATACCTTTCTGCATGA 

150 160 170 180 190 200 

610 620 630 640 650 660 

CTTACGTAGTGGCCTGTAAGCCTCCCCAGAAAAAGGACT-CTCAGCAAT-TCCACCTGGTTCCT^ 

CT- -CTTT CCAGAATGTGGCTGCTC'TCTG'i^ 

210 220 230 240 250 260 270 

670 680 690 700 710 720 730 

TTGGACAGAGTCCmAGGTTTCCAGACTGGCTT^ 

A ACTC---CCACCAGAGCTCAA^ - -CTCTCAACAl^ci^C-'^ 

280 290 300 310 320 

740 750 760 770 780 790 

CTCC-GCACa^CTCCC~--CTAav-CCCAGAGCATTCTCTTCCCCTCATCTC 

s I : : : : • : • ;::::**•::«. : : : : : : * : : : : : • : : : 

330 340 350 360 370 380 390 

800 810 820 830 840 850 

TG--GTTCAGCCTCTGCTGGGAGGCTGAAGCTGACACTCTGGTGAGCTGAGCTC 
. : . . : . , ; : : : : . . : : : . . : : : . : j . i : . . * . : . . 
AGAAGAGCGACCCCCCCTACyU^GTTGGTTCCTGT-ACACrTAGATAGTA 
400 410 420 430 440 450 



46.5% identity in 488 aa overlap; score: 709 

440 450 460 470 480 490 

TGCyiCGAGCCTTTCTCa^GTGTGGCCGCCACCTO--CCA-GACCCCa^^ 

..sj tti . : : ,i a .: : : : :ss...t, t: J 

TGCT-^ATGCTTTCercrrCTTTTAC^ 

10 20 30 40 50 60 70 

500 510 520 530 540 550 560

GATAAAAACVK:CACaVGAGC-CACGGGCCC6TGTCCCTOACa^ 
^ , : . * * t : : : : : : 5 • . 5 . : ; ; s . ! . : * : * • : : i : 

CTAAGCGTCT--aiCaUVGGCTCACTGGTTTGAAATTCAG--CAT^^ 

80 90 100 110 120 130 

570 580 590 600 610 620 630 

TCCGAA-CTGCAGGTACAAAGAGAAGCGACAGAACAAGTCTTACGTAGTGGCCTGTAAGCC^^ 

:..i::t •:::•:::••:}..: ::::: 5 a . :::::::: :: : 

TCCAATGCAAaiGG-QCAATGAGTGGCATC--AACAATT*ATGCCC^ 

140 150 160 170 180

640 650 660 670 680 690 700 

AAAGGACTCTCAGCAATTCCACCTGGTTCCTGTACACTTGGACAGA 

AAATACC'n-rCTG^ 

190 200 210 220 230 240 250 

laminin_EGF: domain 1 of 4, from 3 to 37: score -1.2, E = 0.59

* •>CdCnphQdlsddtCdsddelf geetGqClkCkpnvtGrrCdr .C)cpG 

+ G d+ 4.'fGqC+ * -^G+rC -j-C +G 

i^t272 3 ----HASC DP VHGQCR-CQAGWMGTRCHLpCPBO 31 

yyglp5gdpgqgC<-* 
^--J-g ♦ +c 

mT272 32 FWG A-KC 37 

EGF: domain 1 of 4, from 37 to 67: score 19.2, E = 0.1

* '->CapnnpCsngG tCvn cpggssdnf ggy tCeCppGdyy 1 sy tGIcrC< - 

C-H 4.* C+ngGtCv+ g C+C+pG G+ C 

mT272 37 CSUTCTCKNCOTCVSENG— • ^NCVCAPG PRGPSC 67 



mT272 - - 

DSL: domain 1 of 1, from 10 to 67: score -21.1, E = 8.1

*->WfitdkhiggrtSlGfnleyrirvcCd«nVYGegCnkFCrPrdDafoiH 
i- ♦+ +r* Ce G+ C +g«f 
mT272 10 --HGQCRCQAG-— — WMGTRCHLPCPEQPWGANCSNTCTCK NGG 47 

ytCd«nQnklCleGWkGeyC<-« 
+enGn C++a +0+ C 
m'P272 48 TCVSENGWCVCAPOPRGPSC 67 

laminin_EGF: domain 2 of 4, from 41 to 80: score -1.5, E = 0.63

♦->CdCaphGsladdtCdsddelf geetGqClkCkpnvtGrrCdr .CkpG 
C+C 4-0 tC S € G C4> C p**.* G+ C xtC pG 

mT272 41 CTCKNGQ — TCVS— ENGNfCV-CAPQPRGPSCQRpCPPQ 74 

yyglpsadpg(agC<-* 
y + + C 
mT272 75 RY- — — GKR— C 80 

EGF: domain 2 of 4, from 80 to 110: score 11.8, E = 1.9

*->Capnnx5Cfing , OtCvntpggssdnf gflytCeCppGdyylsytGkrC< 
C + C+++ g tC C G +tG-|.4^ 

inT272 80 CVQC-KCUNNhSSCHPSDG-- -TCSCIAG- — — WTGPDC 110 



mT272 — - 

laminin_EGF: domain 3 of 4, from 93 to 123: score 25.6, E = 0.0012

♦ ->CdCnphGalfiddtCdaddQltgeecGqClkckpnvtQrrC * drckpG 
C Ca-h-^ 'f+C'»-+ 4. G C+ C+ tG++C4"f c pG 

mT272 63 CKCNNNH--— .SSCHP SDGTCS-CIACWTOPDCeEACPPG 117 



yyglpsg<ipgq«C<-* 

*^gl C 

tnT272 118 KWGtr — KC 123 

EGF: domain 3 of 4, from 123 to 153: score 27.3, E = 0.00036

* ->CapnnpCsngGtCvntpggssdnf ggytCeCppOdyylsytGkrCK- 
C+^gGtC++ g ♦Ci'C'^pG -^tO+^C 
mT272 123 CSQLCQCKHGOTCHPQOG-— — — SCICTPQ WTGPNC 153 



XRT272 « - 

laminin_EGF: domain 4 of 4, from 127 to 172: score -5.5, E = 1.4

♦->CdCnphGalsddtCdsddelfgtetGqClkCkpnvtQrrC.drCkpG 
C-^C^** Q tC++ G C C p+ tGf+C ♦ C p 

mT272 127 CQCKHGG- TCHP— QOGSCI-CWGWTGPHCIEGCPPR 160 

yyglpsg . dpgqgC<- * 

mT272 161 MFG«VNCflQLC**QC 172 

EGF: domain 4 of 4, from 166 to 196: score 6.5, E = 5.8

* - >CapnnpC«ngGcCvntpgga sdnf ggyliCeCppGdyy 1 fiytakrC< * 
C*+++ C* g C** g C4CppQ +G +C 

mT272 166 CSQXX:QCOriGEMCHPETG- -ACVCPFG- HSGADC 196 



m^272 - - 
*->CapnnpCsngGtCvntpggssdnfggytCeCppGdyylsytGkrC<-
ratT272 18 tECRCHMGGXiCDRPTG— QCHCAPC VIGDRC 48 



ratT272 

laminin_EGF: domain 1 of 11, from 22 to 61: score 12.3, E = 0.038

•->CdCnphGfilfiddtCd«ddelf gaetGqClkCkpnvtQrrC .drCkpG 
C C-^-e G Cd* ^COqC+ C p-*** G+rC4.*^-C G 

rat:T272 22 CRCHNGG LCDR«- — --FTGQCH-CAPGVIQDRCrEECPVG S5 

yyglpS9dpgqgC<- ♦ 

racr272 S6 rfg — qdc si 

EGF: domain 2 of 11, from 61 to 91: score 18.3, E = 0.18

♦ - >CaptinpCsngGcCvncpggfi fi dnf ggy tCeCppGdyy 1 ay tCkrC < - 
Ca+'f^. c g-i'+C + g C C 4-G +tG*rC 
rat:T272 61 CAETCDCAPGARCFPANG — ACLCEHG FTGDRC 91 



racT272 - - 

laminin_EGF: domain 2 of 11, from 65 to 105: score 4.0, E = 0.2

* ->CdCnphQ5lsddtCdsddelf geetGqClkCJcpnvtGrrCdr . . Ocp 

CdC p + +0 + G+Cl C *'*'*tG+rC C 

ratT272 65 CDCAPGA— — RCrP— — — ANGACL-CEHGFTGDRCTErlCPO 98 

GyyglpsgdpgqgC<-* 
G ygl -^C 
ratt272 99 GRVGL SC 105 

EGF: domain 3 of 11, from 105 to 137: score 4.1, E = 9.6

• "->CapnnpCsng . . GtCvncpggaadnf ggytCeCppGdyy IsytGkrC 

C++'*** C+ ++ C++ +g +C C*pG ♦+G +C 

ratT272 105 CQDPCTCOPEhsLSCHPMHQ -ECSCQPO— WAGLHC 137 



ratT272 

laminin_EGF: domain 3 of 11, from 109 to 150: score 13.1, E = 0.032

* *->CdCnphGal8ddtCd8dd«l f geeCGqClkCkpnvtGrrCdr ^ CkpG 
C+C+p 8lS C++ 4.*G+C+ C+P+ 1-G +C+++C 

ratT272 109 CTCDPEKSLS-— CKP-- MHGECS-CQPGWAOLKOlEfiCP— 142 

yy glp c gdpgqgC< - * 
++ + g oc 

ratT272 143 ~QO— —THQAGC ISO 

EGF: domain 4 of 11, from 150 to 180: score 27.7, E = 0.00026

♦ ->CapnnpC8ngGt:Cvn tpggfi sdaf ggytCeCppGdyylay tGkrC< - 
O^i..,.* c+^gG+C+ g . C+C+pG ytG++C 
ratT272 150 CQEHCLCr-HCGVClAOSG — LCRCAPG YTGPHC- 180 
laminin_EGF: domain 4 of 11, from 154 to 193: score 8.4, E = 0.0d4

*<»>CdCnphQfilfiddtCdsddelf ^eetGqClkOcpnvtGrrC * drCkpQ 
C C ♦he* C +a C"^ C p*i^tO+*C + C pt. 

racT272 154 CtiC-LHG GVCLA-— — OSGLCR-CAPGYTGPHCaNLCPPN 187 

yyglp8gdpgqgC< - • 
ratT272 188 TYGt- — MC 193 

EGF: domain 5 of 11, from 193 to 223: score 10.6, E = 2.5

*->CapnnpCangOtCvnt:pggssdnfggycCftCDpGdyyIaycGJcrC<- 
C n C ++ g tC•<•C■^•^G 4-4- 
ratT272 193 CSSHCSCENAIACSPVDG TCICKEG •WQRGNC 223 



ratT272 - 

laminin_EGF: domain 5 of 11, from 197 to 236: score 0.7, E = 0.4

• ->CdCnphG8lsddtCdsddelf ge«tO<iClkCHpnvcGrrCdr . OcpQ 
CCi-+ C 4. +GC Cki"f 4 +C pG 

rafr272 197 CSCENAl ACSP -*-VDOTCl-CKEGWQRQMCSVi>CPPG 230 

yyglpsgdpgqgC-c- ♦ 
•♦.^•gf +c 
ratT272 231 TWGF- ^SC 236 

EGF: domain 6 of 11, from 236 to 266: score 11.8, E = 1.9

*->C«panpC5ngGtCvntpgga8dnfggytCeCppGdyylaytGkrc<- 
C-f 1- C + + g C•M:'^pG 1* G *C 

ratT272 236 CNASCQCAKEGVCSPQTG ACTCTPG- WRGVHC 266 



rat'r272 - - 

laminin_EGF: domain 6 of 11, from 240 to 279: score -2.2, E = 0.73

•*>CdCnphGcladdCCdsddelfgeetGqClkCkpnvcGrrCdr .CkpG 
C^C + G C ♦ tO+C C P+ G +C G 

ratT272 240 CQCAHEC- VCSP -QTGACT-CTPGWRGVHCQIipCPKG 273 

yyglp8gdpgqgC<*»* 

ratl'272 274 QFG— — BGC 279 

DSL: domain 1 of 1, from 246 to 309: score -19.4, E = 5.2

♦->WstdkhiggrtelGfnleyrtrvtCdenVYaegCttkPCrPrdDafgK . 
* i*+++g+ t +++ C * ♦G«gC+ C+ H 

ratT272 246 <3VCSPOt<5XC1OTP<3WRGVHCQLPCPKGQFGEGCASVCOCD--- 287 

yt , Cd. eaQnklClcCWkO«yC<-* 
•^ -^-Cd* +Q '♦'GW+G C 

ratl'273 288 SDgO^VKaRCRCQAawUGTnC 309 

EGF: domain 7 of 11, from 279 to 309: score 7.0, E = 5.3

* - >CapnttpC fingG tCvntpggs cdnf ggy tCeCppGdyy 1 »ytGk5:C< -» 
Qa+ + Ctt^ C f '•'♦g G + G rC 
raCT272 279 CJISVCTCOHSDGCOPVHO— KCPCQAO — WMGTRC 309 
*->CdCnphGslsddtCdsddelfgeetGqClkCkpnvtGrrCdr.CkpG
CdC* d Cd* <»-4.G*C4- C+ ^ *Gi*rC *C *G 

ratT272 283 CDCD-HS DGCDP — VHGKCR-CQACWMGTRCHtpCPEG 316 

yygipsgdpgqgC< - • 
+*g 4- 4.C 

raCT272 317 PWG — A-NC 322 

EGF: domain 8 of 11, from 322 to 352: score 17.3, E = 0.38

*->CapntipCfingGtCvntpggaadn£ggytCeCppGdyylsytGkrC<- 
C+ + C+ngGtCv4. g CfC+pG +0+0 
rafr272 322 CSlJACTCKNGGTCVPENG — NCVCAPG FHGPSC 352 



ratT272 

laminin_EGF: domain 8 of 11, from 326 to 365: score -1.8, E = 0.67

*->CdCnphaslsddcCdfiddelf geetCqClkCkpnvtGrrCdr . CkpG 
C+C t G tC ♦ e G Ct- C p*-^ G«^ C r+C pO 

rtttT272 326 CTCKNGG TCVP----.--ENQNCV-CM>GFRGPSCQR.pCPPO 359 

yyglpsgdpgggC<-* 
y + + c 
ratT272 360 RY GKR— C 365 

EGF: domain 9 of 11, from 365 to 394: score 18.3, E = 0.18

♦ -XapnnpCsngGcCvn tpgga sdnf ggytCeCppGdyy l»ytGkrC< - 
C p C+n+ g CC C G +ta4.4.c 
ratT272 365 CVPC-KCNNHSSCMSDG— TCSOdAO ^WTGPDC 394 



ratT272 

laminin_EGF: domain 9 of 11, from 368 to 407: score 24.0, E = 0.0034
*->CdCnphG8l8ddtCdsddalf gaetGqClkCkpnvtGrrC *drCkpG 
C Cn-fh* *C■♦'-^ * G C-h + cG++C++ C pG 

raCT272 368 CKCMMHS**' SCK?— «*^-*S0GTCS«CLAGWTGPDC8CSCPPQ 401 

yyglp3gdpgQgC<- ♦ 
^^ql C 
raCT272 402 HWGL -KC 407 

EGF: domain 10 of 11, from 407 to 437: score 24.0, E = 0.0035

*->CapnnpCangGtCvntpggofldn<ggytCeCppGdyyl«ytGk2:C<- 
C+^+f C++g*tC++ g '♦'C+C pO +tG*+c 
ratT272 407 CSQPCOCHHGATCHPQOG— SCVCIPG WTGPNC 437 



ratT272 - - 

laminin_EGF: domain 10 of 11, from 411 to 450: score 6.5, E = 0.12

*->CdCnphGalfiddtCdsddelfge«cG<iClkCkpnvcGrrCdrCkpOy 
C+C+1- * tC*'^ G C-t" C P4. cG+fC + 
ratT272 411 COCHttGA— — TCHP-- QOGSCV^CIPGWTGPNCSE 439 

yglp«gdpgqgC<-» 
g pa-^-f-i-g^+C 
ratT272 440 -GCFSRlirGVNC 450 
EGF: domain 11 of 11, from 450 to 480: score 8.7, E = 3.7

* - >CapnnpC sngG t Cvn cpaga cdinf ggy cCcCppGdyy 1 sy tGkrC < - 
C*.+4.-^ c+ g C++ g C+CppQ +C 

r6tt272 450 CSQLCQCDPQEMCH5ETG- — ACVCPPQ— — MSGAHC 480 



ratT272 - * . 

laminin_EGF: domain 11 of 11, from 454 to 489: score -6.3, E = 1.7

*->CdCnphaal6ddtCdsddelfg«*tGqiClJcCkpnvtOrrCdrCkp<Sy 
C+C+p G I- C++ atG'VC* C p+ +0 +C 
ratT272 454 CQCDP-G— *EMCHP — — .ETGACV-CPPGKSOAHC- — K 481 

yglpsgdpgqgC<** 
g + -f* 

ratT272 482 VGSQE-SFT— 48$ 



// 



Fig. 41P
SEQUENCE LISTING
<110> Millennium Pharmaceuticals, Inc. 

<120> MEMBRANE-ASSOCIATED AND SECRETED PROTEINS AND USES THEREOF

<130> 7853-206-228 

<150> 09/345,464 
<151> 1999-06-30 

<160> 148 

<170> FastSEQ for Windows Version 3.0 

<210> 1 

<211> 3284

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (1222)...(1944)
<400> 1 

gtcgacccac gcgtccgtta tgtaactata cattttccca gaaattttag tatatgatat 60
gattttgttt tctttcatcc cttttcccaa gcagtttatt atgaaaattt tcaaacatac 120
agcaatgttg agaaaatttt acagtaaatg cctataccca ttacctaaat tttaccatta 180
acattttacc ctgctggcat tattgtgctt atccatctac gtatccctct ctcccttcat 240
tggtgtattt ctaagtaaat tgtaggcctc agtacacttc cttctgaatt cttcagcatg 300
cacaacagta ttatattcca tttttaaaag agcaattctt gatagattta tatagttttg 360
taaaatgttc atatagagct acaaatttta tctttttgtt tcttattgta tgtctagggt 420
cctgaagggg atgctggcat tgttgggata tcaggtccta aaggtcctat tggacacaga 480
ggaaacactg gtccccttgg cagagaaggt ataataggcc caacaggtag aactggaccc 540
agaggtgaaa agggctttag aggtgaaact ggtcctcaag gaccaagagg tcaaccaggg 600
cctccaggtc cacctggagc accaggccca agaaagcaaa tggatatcaa tgctgctatt 660
caagccttga ttgaatcaaa tactgcccta cagatggagg taacatatct ggttttattt 720
atattggcac tgtctctcaa tataccaatt aaacagagaa aatttttgga ggccaaaatg 780
tgacattatc tcaaagattg tatttaaaac agattgaaaa tgtgaaacca ttctcaagaa 840
caaagtaagt gattttggta taattaaaca gaaatatatg cgtaggatgt tttgtaagga 900
aaacatttaa atcaaaaatt tagtactgtt atttgtaagg aatttggtac tatccaagaa 960
agtagttaaa tgaggttagc catgtttctt aaaatgagat atatatatta tcactactca 1020
tttatttaaa ctctaatgat tcaatgtgta atttaaaaaa cataatacag tagacatagc 1080
aattcttatg ttagcttgaa aactaaactt gcaaatgtga atttaacctc tttaaaagat 1140
taaggttatt aaagcataca catatgccta tgcttaaata taaactgttc tttacattct 1200
actcacaact tactacacat a atg gaa aca cat tct tct cct gcc ttg gcc 1251
Met Glu Thr His Ser Ser Pro Ala Leu Ala
1 5 10

cat gtt ggt cct cag gat ttt ttt gtt tat ata att ctt atg atg act 1299
His Val Gly Pro Gln Asp Phe Phe Val Tyr Ile Ile Leu Met Met Thr
15 20 25

tgg cag agc tac cag aat act gaa gtg act tta att gac cac agt gaa 1347
Trp Gln Ser Tyr Gln Asn Thr Glu Val Thr Leu Ile Asp His Ser Glu
30 35 40
gag ata ttc aaa acc ctg aac tac ctt agc aat tta ttg cac agc atc 1395
Glu Ile Phe Lys Thr Leu Asn Tyr Leu Ser Asn Leu Leu His Ser Ile
45 50 55

aag aat cct ctt ggc aca cga gat aac cca gca cga atc tgc aaa gat 1443
Lys Asn Pro Leu Gly Thr Arg Asp Asn Pro Ala Arg Ile Cys Lys Asp
60 65 70

tta ctt aac tgt gaa caa aaa gta tca gat gga aaa tac tgg att gac 1491
Leu Leu Asn Cys Glu Gln Lys Val Ser Asp Gly Lys Tyr Trp Ile Asp
75 80 85 90

cca aat ctt ggc tgt cct tca gat gcc att gag gtt ttc tgc aat ttc 1539
Pro Asn Leu Gly Cys Pro Ser Asp Ala Ile Glu Val Phe Cys Asn Phe
95 100 105

agt gct ggt ggc cag aca tgc tta cct cct gtt tct gta aca aag ttg 1587
Ser Ala Gly Gly Gln Thr Cys Leu Pro Pro Val Ser Val Thr Lys Leu
110 115 120

gag ttt gga gtt ggg aaa gtc cag atg aac ttc ctt cat tta ctg agt 1635
Glu Phe Gly Val Gly Lys Val Gln Met Asn Phe Leu His Leu Leu Ser
125 130 135

tcg gaa gcc acc cat atc atc acc att cac tgt cta aac acc cca agg 1683
Ser Glu Ala Thr His Ile Ile Thr Ile His Cys Leu Asn Thr Pro Arg
140 145 150

tgg aca agc aca caa aca agt ggc cca gga ttg cct att ggt ttc aag 1731
Trp Thr Ser Thr Gln Thr Ser Gly Pro Gly Leu Pro Ile Gly Phe Lys
155 160 165 170

gga tgg aat ggc cag att ttt aaa gta aac act cta ctt gaa cct aaa 1779
Gly Trp Asn Gly Gln Ile Phe Lys Val Asn Thr Leu Leu Glu Pro Lys
175 180 185

gtg ctt tca gat gac tgc aag att caa gat ggc agc tgg cat aag gca 1827
Val Leu Ser Asp Asp Cys Lys Ile Gln Asp Gly Ser Trp His Lys Ala
190 195 200

aca ttt ctt ttt cac acc cag gaa cct aat caa ctt cca gtg att gaa 1875
Thr Phe Leu Phe His Thr Gln Glu Pro Asn Gln Leu Pro Val Ile Glu
205 210 215

gta caa aaa ctt cct cat ctc aaa act gaa cga aag tat tac att gac 1923
Val Gln Lys Leu Pro His Leu Lys Thr Glu Arg Lys Tyr Tyr Ile Asp
220 225 230

agc agt tct gta tgc ttt ctg taaagtctct gaattagttc cgaattcagg 1974
Ser Ser Ser Val Cys Phe Leu
235 240

ctgttggcca ggtaattgct gcagagggag aaataagaca gacagataca gtcattatga 2034
aatgcatgta ataaagcatt ggctaaatct taaagaatct caggaagaac agacttcctc 2094
ctaagaagga gaaaaggcat ttttaaagga ctatgattga taaagtattt aattctttta 2154
aaaattatat tcatctcagc tttcttagag aattccctag aactaaaaat ttataaatat 2214
ggaattcttc agggtatctt atatttttga ctgagtgcgt agtacccatt agacagctgg 2274
agatgcagag cactatggag caatactggc taatgcttcc agatgtgcac tgcttctgtc 2334
taaaaattac aagccacagt ctaatatgtc ttattttcca aaacactaag ctgtattcag 2394
gtccccgatg ggcatataca tcttagccgg tgatacacta cctcttacgt gttgcctctt 2454
tgtgttgctt ggtgctcttt cgaaaacaag gtgcttatgg ctttcataga ctatttcctt 2514
tttcatcttt gtcattcttt aaaagtgtat gtactggtta catcaagata tgttttggtt 2574
gttagtactt attttaattt gtttggtcac acacttaata acacatgaaa ctatttatgt 2634
gaagtccttg ttttatttta aaattctctt tgtgtatttg gaatcaaagc cagcacattg 2694
taacctgtgc ttgtacgcaa aagaattaga tttctfctgtt cttgttttat tttttaaatt 2754
gttgtaaaaa ttattatagg ccagctacat ctagtagtag gtttggggta cagattgggg 2814
gttgtgccat actgttttta aagttcatga tcatctggaa tgatacttag tgtatatata 2874
ttttgtaaag ttttaattca gcaaattttt tgaaattgct gctgttttaa attataaaac 2934
ctttatattt ctgctttgta gaaattatat gttttgtagt attcattgat tttctttcac 2994
tgtacttaaa tttagtgtta gtactttaaa atttttaatt taccagtctt taaagcaaca 3054
tccagaaaaa aaaaagtctt ttcccattta aaataggctc agccagttca atgtcgcctt 3114
gttatcagag aaatattagt tcaatactga aagaaaaata ttatacctct tggtatctag 3174
aaaagcttgt tcatccatta taaatatatc tttagccaca gcaaaccaca cttaacctat 3234
ctataataaa aatgtgcttt aaataaaaaa aaaaaaaaaa agggcggccg 3284



<210> 2
<211> 241
<212> PRT
<213> Homo sapiens

<400> 2
Met Glu Thr His Ser Ser Pro Ala Leu Ala His Val Gly Pro Gln Asp
1 5 10 15
Phe Phe Val Tyr Ile Ile Leu Met Met Thr Trp Gln Ser Tyr Gln Asn
20 25 30
Thr Glu Val Thr Leu Ile Asp His Ser Glu Glu Ile Phe Lys Thr Leu
35 40 45
Asn Tyr Leu Ser Asn Leu Leu His Ser Ile Lys Asn Pro Leu Gly Thr
50 55 60
Arg Asp Asn Pro Ala Arg Ile Cys Lys Asp Leu Leu Asn Cys Glu Gln
65 70 75 80
Lys Val Ser Asp Gly Lys Tyr Trp Ile Asp Pro Asn Leu Gly Cys Pro
85 90 95
Ser Asp Ala Ile Glu Val Phe Cys Asn Phe Ser Ala Gly Gly Gln Thr
100 105 110
Cys Leu Pro Pro Val Ser Val Thr Lys Leu Glu Phe Gly Val Gly Lys
115 120 125
Val Gln Met Asn Phe Leu His Leu Leu Ser Ser Glu Ala Thr His Ile
130 135 140
Ile Thr Ile His Cys Leu Asn Thr Pro Arg Trp Thr Ser Thr Gln Thr
145 150 155 160
Ser Gly Pro Gly Leu Pro Ile Gly Phe Lys Gly Trp Asn Gly Gln Ile
165 170 175
Phe Lys Val Asn Thr Leu Leu Glu Pro Lys Val Leu Ser Asp Asp Cys
180 185 190
Lys Ile Gln Asp Gly Ser Trp His Lys Ala Thr Phe Leu Phe His Thr
195 200 205
Gln Glu Pro Asn Gln Leu Pro Val Ile Glu Val Gln Lys Leu Pro His
210 215 220
Leu Lys Thr Glu Arg Lys Tyr Tyr Ile Asp Ser Ser Ser Val Cys Phe
225 230 235 240
Leu



<210> 3
<211> 723
<212> DNA
<213> Homo sapiens

<400> 3
atggaaacac attcttctcc tgccttggcc catgttggtc ctcaggattt ttttgtttat 60
ataattctta tgatgacttg gcagagctac cagaatactg aagcgacttt aattgaccac 120
agtgaagaga tattcaaaac cctgaactac cttagcaatt tattgcacag catcaagaat 180
cctcttggca cacgagataa cccagcacga atctgcaaag atttacttaa ctgtgaacaa 240
aaagtatcag atggaaaata ctggattgac ccaaatcttg gctgtccttc agatgccatt 300
gaggttttct gcaatttcag tgctggtggc cagacatgct tacctcctgt ttctgtaaca 360
aagttggagt ttggagttgg gaaagtccag atgaacttcc ttcatttact gagttcggaa 420
gccacccata tcatcaccat tcactgtcta aacaccccaa ggtggacaag cacacaaaca 480
agtggcccag gattgcctat tggtttcaag ggatggaatg gccagatttt taaagtaaac 540
actctacttg aacctaaagt gctttcagat gactgcaaga ttcaagatgg cagctggcat 600
aaggcaacat ttctttttca cacccaggaa cctaatcaac ttccagtgat tgaagtacaa 660
aaacttcctc atctcaaaac tgaacgaaag tattacattg acagcagttc tgtatgcttt 720
ctg 723



<210> 4 

<211> 3169 

<212> DNA 

<213> Homo sapiens 



<220> 
<221> CDS 
<222> (57)...(1568)



<400> 4 

gtcgacccac gcgtccgcgc ccgctgagcc ccccgccgag gtccggacag gccgag atg 59
Met
1

acg ccg agc ccc ctg ttg ctg ctc ctg ctg ccg ccg ctg ctg ctg ggg 107
Thr Pro Ser Pro Leu Leu Leu Leu Leu Leu Pro Pro Leu Leu Leu Gly
5 10 15

gcc ttc ccg ccg gcc gcc gcc gcc cga ggc ccc cca aag atg gcg gac 155
Ala Phe Pro Pro Ala Ala Ala Ala Arg Gly Pro Pro Lys Met Ala Asp
20 25 30

aag gtg gtc cca cgg cag gtg gcc cgg ctg ggc cgc act gtg cgg ctg 203
Lys Val Val Pro Arg Gln Val Ala Arg Leu Gly Arg Thr Val Arg Leu
35 40 45

cag tgc cca gtg gag ggg gac ccg ccg ccg ctg acc atg tgg acc aag 251
Gln Cys Pro Val Glu Gly Asp Pro Pro Pro Leu Thr Met Trp Thr Lys
50 55 60 65

gat ggc cgc acc atc cac agc ggc tgg agc cgc ttc cgc gtg ctg ccg 299
Asp Gly Arg Thr Ile His Ser Gly Trp Ser Arg Phe Arg Val Leu Pro
70 75 80

cag ggg ctg aag gtg aag cag gtg gag cgg gag gat gcc ggc gtg tac 347
Gln Gly Leu Lys Val Lys Gln Val Glu Arg Glu Asp Ala Gly Val Tyr
85 90 95

gtg tgc aag gcc acc aac ggc ttc ggc agc ctg agc gtc aac tac acc 395
Val Cys Lys Ala Thr Asn Gly Phe Gly Ser Leu Ser Val Asn Tyr Thr
100 105 110

ctc gtc gtg ctg gat gac att agc cca ggg aag gag agc ctg ggg ccc 443
Leu Val Val Leu Asp Asp Ile Ser Pro Gly Lys Glu Ser Leu Gly Pro
115 120 125

gac agc tcc tct ggg ggt caa gag gac ccc gcc agc cag cag tgg gca 491
Asp Ser Ser Ser Gly Gly Gln Glu Asp Pro Ala Ser Gln Gln Trp Ala
130 135 140 145

cga ccg cgc ttc aca cag ccc tcc aag atg agg cgc cgg gtg atc gca 539
Arg Pro Arg Phe Thr Gln Pro Ser Lys Met Arg Arg Arg Val Ile Ala
150 155 160

cgg ccc gtg ggt agc tcc gtg cgg ctc aag tgc gtg gcc agc ggg cac 587
Arg Pro Val Gly Ser Ser Val Arg Leu Lys Cys Val Ala Ser Gly His
165 170 175

cct cgg ccc gac atc acg tgg atg aag gac gac cag gcc ttg acg cgc 635
Pro Arg Pro Asp Ile Thr Trp Met Lys Asp Asp Gln Ala Leu Thr Arg
180 185 190

cca gag gcc gct gag ccc agg aag aag aag tgg aca ctg agc ctg aag 683
Pro Glu Ala Ala Glu Pro Arg Lys Lys Lys Trp Thr Leu Ser Leu Lys
195 200 205

aac ctg cgg ccg gag gac agc ggc aaa tac acc tgc cgc gtg tcg aac 731
Asn Leu Arg Pro Glu Asp Ser Gly Lys Tyr Thr Cys Arg Val Ser Asn
210 215 220 225

cgc gcg ggc gcc atc aac gcc acc tac aag gtg gat gtg atc cag cgg 779
Arg Ala Gly Ala Ile Asn Ala Thr Tyr Lys Val Asp Val Ile Gln Arg
230 235 240

acc cgt tcc aag ccc gtg ctc aca ggc acg cac ccc gtg aac acg acg 827
Thr Arg Ser Lys Pro Val Leu Thr Gly Thr His Pro Val Asn Thr Thr
245 250 255

gtg gac ttc ggg ggg acc acg tcc ttc cag tgc aag gtg cgc agc gac 875
Val Asp Phe Gly Gly Thr Thr Ser Phe Gln Cys Lys Val Arg Ser Asp
260 265 270

gtg aag ccg gtg atc cag tgg ctg aag cgc gtg gag tac ggc gcc gag 923
Val Lys Pro Val Ile Gln Trp Leu Lys Arg Val Glu Tyr Gly Ala Glu
275 280 285

ggc cgc cac aac tcc acc atc gat gtg ggc ggc cag aag ttt gtg gtg 971
Gly Arg His Asn Ser Thr Ile Asp Val Gly Gly Gln Lys Phe Val Val
290 295 300 305

ctg ccc acg ggt gac gtg tgg tcg cgg ccc gac ggc tcc tac ctc aat 1019
Leu Pro Thr Gly Asp Val Trp Ser Arg Pro Asp Gly Ser Tyr Leu Asn
310 315 320

aag ctg ctc atc acc cgt gcc cgc cag gac gat gcg ggc atg tac atc 1067
Lys Leu Leu Ile Thr Arg Ala Arg Gln Asp Asp Ala Gly Met Tyr Ile
325 330 335

tgc ctt ggc gcc aac acc atg ggc tac agc ttc cgc agc gcc ttc ctc 1115
Cys Leu Gly Ala Asn Thr Met Gly Tyr Ser Phe Arg Ser Ala Phe Leu
340 345 350

acc gtg ctg cca gac cca aaa ccg cca ggg cca cct gtg gcc tcc tcg 1163
Thr Val Leu Pro Asp Pro Lys Pro Pro Gly Pro Pro Val Ala Ser Ser
355 360 365

tcc tcg gcc act agc ctg ccg tgg ccc gtg gtc atc ggc atc cca gcc 1211
Ser Ser Ala Thr Ser Leu Pro Trp Pro Val Val Ile Gly Ile Pro Ala
370 375 380 385

ggc gct gtc ttc atc ctg ggc acc ctg ctc ctg tgg ctt tgc cag gcc 1259
Gly Ala Val Phe Ile Leu Gly Thr Leu Leu Leu Trp Leu Cys Gln Ala
390 395 400

cag aag aag ccg tgc acc ccc gcg cct gcc cct ccc ctg cct ggg cac 1307
Gln Lys Lys Pro Cys Thr Pro Ala Pro Ala Pro Pro Leu Pro Gly His
405 410 415

cgc ccg ccg ggg acg gcc cgc gac cgc agc gga gac aag gac ctt ccc 1355
Arg Pro Pro Gly Thr Ala Arg Asp Arg Ser Gly Asp Lys Asp Leu Pro
420 425 430

tcg ttg gcc gcc ctc agc gct ggc cct ggt gtg ggg ctg tgt gag gag 1403
Ser Leu Ala Ala Leu Ser Ala Gly Pro Gly Val Gly Leu Cys Glu Glu
435 440 445

cat ggg tct ccg gca gcc ccc cag cac tta ctg ggc cca ggc cca gtt 1451
His Gly Ser Pro Ala Ala Pro Gln His Leu Leu Gly Pro Gly Pro Val
450 455 460 465

gct ggc cct aag ttg tac ccc aaa ctc tac aca gac atc cac aca cac 1499
Ala Gly Pro Lys Leu Tyr Pro Lys Leu Tyr Thr Asp Ile His Thr His
470 475 480

aca cac aca cac tct cac aca cac tca cac gtg gag ggc aag gtc cac 1547
Thr His Thr His Ser His Thr His Ser His Val Glu Gly Lys Val His
485 490 495

cag cac atc cac tat cag tgc tagacggcac cgtatctgca gtgggcacgg 1598
Gln His Ile His Tyr Gln Cys
500

gggggccggc cagacaggca gactgggagg atggaggacg gagctgcaga cgaaggcagg 1658
ggacccatgg cgaggaggaa tggccagcac cccaggcagt ctgtgtgtga ggcatagccc 1718
ctggacacac acacacagac acacacactg cctggatgca tgtatgcaca cacatgcgcg 1778
cacacgtgct ccctgaaggc acacgtacgc acacacgcac atgcacagat atgccgcctg 1838
ggcacacaga taagctgccc aaatgcacgc acacgcacag agacatgcca gaacatacaa 1898
ggacatgctg cctgaacata cacacgcaca cccatgcgca gatgtgctgc ctggacacac 1958
acacacacac ggatatgctg tctggacgca cacacgtgca gatatggtat ccggacacac 2018
acgtgcacag atatgctgcc tggacacaca gataatgctg ccttgacaca cacatgcacg 2078
gatattgcct ggacacacac acacacacgt gtgcacagat atgctgtctg gacacgcaca 2138
cacatgcaga tatgctgcct ggacacacac ttccagacac acgtgcacag gcgcagatat 2198
gctgcctgga cacacgcaga tatgctgtct agtcacacac acacgcagac atgctgtccg 2258
gacacacaca cgcatgcaca gatatgctgt ccggacacac acacgcacgc agatatgctg 2318
cctggacaca cacacagata atgctgcctc aacactcaca cacgtgcaga tattgcctgg 2378
acacacacat gtgcacagat atgctgtctg gacatgcaca cacgtgcaga tatgctgtcc 2438
ggatacacac gcacgcacac atgcagatat gctgcctggg cacacacttc cggacacaca 2498
tgcacacaca ggtgcagata tgctgcctgg acacacgcag actgacgtgc ttttgggagg 2558
gtgtgccgtg aagcctgcag tacgtgtgcc gtgaggctca tagttgatga gggactttcc 2618
ctgctccacc gtcactcccc caactctgcc cgcctctgtc cccgcctcag tccccgcctc 2678
catccccgcc tctgtcccct ggccttggcg gctatttttg ccacctgcct tgggtgccca 2738
ggagtcccct actgctgtgg gctggggttg ggggcacagc agccccaagc ctgagaggct 2798
ggagcccatg gctagtggct catccccact gcattctccc cctgacacag agaaggggcc 2858
ttggtattta tatttaagaa atgaagataa tattaataat gatggaagga agactgggtt 2918
gcagggactg tggtctctcc tggggcccgg gacccgcctg gtctttcagc catgctgatg 2978
accacacccc gtccaggcca gacaccaccc cccaccccac tgtcgtggtg gccccagatc 3038
tctgtaattt tatgtagagt ttgagctgaa gccccgtata tttaatttat tttgttaaac 3098
atgaaagtgc aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3158
agggcggccg c 3169



<210> 5
<211> 504
<212> PRT
<213> Homo sapiens

<400> 5
Met Thr Pro Ser Pro Leu Leu Leu Leu Leu Leu Pro Pro Leu Leu Leu
1 5 10 15
Gly Ala Phe Pro Pro Ala Ala Ala Ala Arg Gly Pro Pro Lys Met Ala
20 25 30
Asp Lys Val Val Pro Arg Gln Val Ala Arg Leu Gly Arg Thr Val Arg
35 40 45
Leu Gln Cys Pro Val Glu Gly Asp Pro Pro Pro Leu Thr Met Trp Thr
50 55 60
Lys Asp Gly Arg Thr Ile His Ser Gly Trp Ser Arg Phe Arg Val Leu
65 70 75 80
Pro Gln Gly Leu Lys Val Lys Gln Val Glu Arg Glu Asp Ala Gly Val
85 90 95
Tyr Val Cys Lys Ala Thr Asn Gly Phe Gly Ser Leu Ser Val Asn Tyr
100 105 110
Thr Leu Val Val Leu Asp Asp Ile Ser Pro Gly Lys Glu Ser Leu Gly
115 120 125
Pro Asp Ser Ser Ser Gly Gly Gln Glu Asp Pro Ala Ser Gln Gln Trp
130 135 140
Ala Arg Pro Arg Phe Thr Gln Pro Ser Lys Met Arg Arg Arg Val Ile
145 150 155 160
Ala Arg Pro Val Gly Ser Ser Val Arg Leu Lys Cys Val Ala Ser Gly
165 170 175
His Pro Arg Pro Asp Ile Thr Trp Met Lys Asp Asp Gln Ala Leu Thr
180 185 190
Arg Pro Glu Ala Ala Glu Pro Arg Lys Lys Lys Trp Thr Leu Ser Leu
195 200 205
Lys Asn Leu Arg Pro Glu Asp Ser Gly Lys Tyr Thr Cys Arg Val Ser
210 215 220
Asn Arg Ala Gly Ala Ile Asn Ala Thr Tyr Lys Val Asp Val Ile Gln
225 230 235 240
Arg Thr Arg Ser Lys Pro Val Leu Thr Gly Thr His Pro Val Asn Thr
245 250 255
Thr Val Asp Phe Gly Gly Thr Thr Ser Phe Gln Cys Lys Val Arg Ser
260 265 270
Asp Val Lys Pro Val Ile Gln Trp Leu Lys Arg Val Glu Tyr Gly Ala
275 280 285
Glu Gly Arg His Asn Ser Thr Ile Asp Val Gly Gly Gln Lys Phe Val
290 295 300
Val Leu Pro Thr Gly Asp Val Trp Ser Arg Pro Asp Gly Ser Tyr Leu
305 310 315 320
Asn Lys Leu Leu Ile Thr Arg Ala Arg Gln Asp Asp Ala Gly Met Tyr
325 330 335
Ile Cys Leu Gly Ala Asn Thr Met Gly Tyr Ser Phe Arg Ser Ala Phe
340 345 350
Leu Thr Val Leu Pro Asp Pro Lys Pro Pro Gly Pro Pro Val Ala Ser
355 360 365
Ser Ser Ser Ala Thr Ser Leu Pro Trp Pro Val Val Ile Gly Ile Pro
370 375 380
Ala Gly Ala Val Phe Ile Leu Gly Thr Leu Leu Leu Trp Leu Cys Gln
385 390 395 400
Ala Gln Lys Lys Pro Cys Thr Pro Ala Pro Ala Pro Pro Leu Pro Gly
405 410 415
His Arg Pro Pro Gly Thr Ala Arg Asp Arg Ser Gly Asp Lys Asp Leu
420 425 430
Pro Ser Leu Ala Ala Leu Ser Ala Gly Pro Gly Val Gly Leu Cys Glu
435 440 445
Glu His Gly Ser Pro Ala Ala Pro Gln His Leu Leu Gly Pro Gly Pro
450 455 460
Val Ala Gly Pro Lys Leu Tyr Pro Lys Leu Tyr Thr Asp Ile His Thr
465 470 475 480
His Thr His Thr His Ser His Thr His Ser His Val Glu Gly Lys Val
485 490 495
His Gln His Ile His Tyr Gln Cys



















<210> 6
<211> 1512
<212> DNA
<213> Homo sapiens

<400> 6
atgacgccga gccccctgtt gctgctcctg ctgccgccgc tgctgctggg ggccttcccg 60
ccggccgccg ccgcccgagg ccccccaaag atggcggaca aggtggtccc acggcaggtg 120
gcccggctgg gccgcactgt gcggctgcag tgcccagtgg agggggaccc gccgccgctg 180
accatgtgga ccaaggatgg ccgcaccatc cacagcggct ggagccgctt ccgcgtgctg 240
ccgcaggggc tgaaggtgaa gcaggtggag cgggaggatg ccggcgtgta cgtgtgcaag 300
gccaccaacg gcttcggcag cctgagcgtc aactacaccc tcgtcgtgct ggatgacatt 360
agcccaggga aggagagcct ggggcccgac agctcctctg ggggtcaaga ggaccccgcc 420
agccagcagt gggcacgacc gcgcttcaca cagccctcca agatgaggcg ccgggtgatc 480
gcacggcccg tgggtagctc cgtgcggctc aagtgcgtgg ccagcgggca ccctcggccc 540
gacatcacgt ggatgaagga cgaccaggcc ttgacgcgcc cagaggccgc tgagcccagg 600
aagaagaagt ggacactgag cctgaagaac ctgcggccgg aggacagcgg caaatacacc 660
tgccgcgtgt cgaaccgcgc gggcgccatc aacgccacct acaaggtgga tgtgatccag 720
cggacccgtt ccaagcccgt gctcacaggc acgcaccccg tgaacacgac ggtggacttc 780
ggggggacca cgtccttcca gtgcaaggtg cgcagcgacg tgaagccggt gatccagtgg 840
ctgaagcgcg tggagtacgg cgccgagggc cgccacaact ccaccatcga tgtgggcggc 900
cagaagtttg tggtgctgcc cacgggtgac gtgtggtcgc ggcccgacgg ctcctacctc 960
aataagctgc tcatcacccg tgcccgccag gacgatgcgg gcatgtacat ctgccttggc 1020
gccaacacca tgggctacag cttccgcagc gccttcctca ccgtgctgcc agacccaaaa 1080
ccgccagggc cacctgtggc ctcctcgtcc tcggccacta gcctgccgtg gcccgtggtc 1140
atcggcatcc cagccggcgc tgtcttcatc ctgggcaccc tgctcctgtg gctttgccag 1200
gcccagaaga agccgtgcac ccccgcgcct gcccctcccc tgcctgggca ccgcccgccg 1260
gggacggccc gcgaccgcag cggagacaag gaccttccct cgttggccgc cctcagcgct 1320
ggccctggtg tggggctgtg tgaggagcat gggtctccgg cagcccccca gcacttactg 1380
ggcccaggcc cagttgctgg ccctaagttg taccccaaac tctacacaga catccacaca 1440
cacacacaca cacactctca cacacactca cacgtggagg gcaaggtcca ccagcacatc 1500
cactatcagt gc 1512



<210> 7 

<211> 1074 

<212> DNA

<213> Mus musculus

<220>

<221> CDS

<222> (3)...(626)

<221> modified_base
<222> all "n" positions
<223> n=a, c, g, or t

<400> 7 

ca cgc gtc cgg ccc acg ggt gat gtg tgg tca cgg cct gat ggc tcc 47
Arg Val Arg Pro Thr Gly Asp Val Trp Ser Arg Pro Asp Gly Ser
1 5 10 15

tac ctc aac aag ctg ctc atc tct cgg gcc cgc cag gat gat gct ggc 95
Tyr Leu Asn Lys Leu Leu Ile Ser Arg Ala Arg Gln Asp Asp Ala Gly
20 25 30

atg tac atc tgc cta ggt gca aat acc atg ggc tac agt ttc cgt agc 143
Met Tyr Ile Cys Leu Gly Ala Asn Thr Met Gly Tyr Ser Phe Arg Ser
35 40 45

gcc ttc ctc act gta tta cca gac ccc aaa cct cca ggg cct cct atg 191
Ala Phe Leu Thr Val Leu Pro Asp Pro Lys Pro Pro Gly Pro Pro Met
50 55 60

gct tct tca tcg tca tcc aca agc ctg cca tgg cct gtg gtg atc ggc 239
Ala Ser Ser Ser Ser Ser Thr Ser Leu Pro Trp Pro Val Val Ile Gly
65 70 75

atc cca gct ggt gct gtc ttc atc cta ggc act gtg ctg ctc tgg ctt 287
Ile Pro Ala Gly Ala Val Phe Ile Leu Gly Thr Val Leu Leu Trp Leu
80 85 90 95

tgc cag acc aag aag aag cca tgt gcc cca gca tct aca ctt cct gtg 335
Cys Gln Thr Lys Lys Lys Pro Cys Ala Pro Ala Ser Thr Leu Pro Val
100 105 110

cct ggg cat cgt ccc cca ggg aca tcc cga gaa cgc agt ggt gac aag 383
Pro Gly His Arg Pro Pro Gly Thr Ser Arg Glu Arg Ser Gly Asp Lys
115 120 125

gac ctg ccc tca ttg gct gtg ggc ata tgt gag gag cat gga tcc gcc 431
Asp Leu Pro Ser Leu Ala Val Gly Ile Cys Glu Glu His Gly Ser Ala
130 135 140

atg gcc ccc cag cac atc ctg gcc tct ggc tca act gct ggc ccc aag 479
Met Ala Pro Gln His Ile Leu Ala Ser Gly Ser Thr Ala Gly Pro Lys
145 150 155

ctg tac ccc aag cta tac aca gat gtg cac aca cac aca cat aca cac 527
Leu Tyr Pro Lys Leu Tyr Thr Asp Val His Thr His Thr His Thr His
160 165 170 175

acc tgc act cac acg ctc tca tgt tgg agg gca agg ttc atc aac acc 575
Thr Cys Thr His Thr Leu Ser Cys Trp Arg Ala Arg Phe Ile Asn Thr
180 185 190

agc atg tcc act atc agt gct aaa tac agc gaa tct cca agc act gtg 623
Ser Met Ser Thr Ile Ser Ala Lys Tyr Ser Glu Ser Pro Ser Thr Val
195 200 205

tcc tgaggtaggc atttgggggc caaggcaaca ggttgggaga attgagaaca 676
Ser



atggaggaag agtatcttag ggtgccttat ggtggacact cacaaacttg gccatataga 736 

tgtatgtact accagatgaa cagccagcca gattcacaca cgcacatgtt taaacgtgta 796 

aacgtgtgca caactgcaca cacaacctga gaaaccttca ggaggatttg tggtgtgact 856 

ttgcagtgac atgtagcgat ggctagttga aggaatctcc ctcatgtctt agtggtcatg 916 

gccacttccc cacccctgcc catctgtgtt cctgcctggc cttggtggtg cttccgtgtg 976

ccctgggttt tccaggaacc ctatcaacct gactggggtg agcagtgcag ccatgcntgg 1036 

aggtttgagc caccctcccc ttgctagaga gaagggcn 1074 

<210> 8 

<211> 208 

<212> PRT

<213> Mus musculus 

<400> 8 

Arg Val Arg Pro Thr Gly Asp Val Trp Ser Arg Pro Asp Gly Ser Tyr
1 5 10 15

Leu Asn Lys Leu Leu Ile Ser Arg Ala Arg Gln Asp Asp Ala Gly Met
20 25 30

Tyr Ile Cys Leu Gly Ala Asn Thr Met Gly Tyr Ser Phe Arg Ser Ala
35 40 45

Phe Leu Thr Val Leu Pro Asp Pro Lys Pro Pro Gly Pro Pro Met Ala
50 55 60

Ser Ser Ser Ser Ser Thr Ser Leu Pro Trp Pro Val Val Ile Gly Ile
65 70 75 80

Pro Ala Gly Ala Val Phe Ile Leu Gly Thr Val Leu Leu Trp Leu Cys
85 90 95

Gln Thr Lys Lys Lys Pro Cys Ala Pro Ala Ser Thr Leu Pro Val Pro
100 105 110

Gly His Arg Pro Pro Gly Thr Ser Arg Glu Arg Ser Gly Asp Lys Asp
115 120 125

Leu Pro Ser Leu Ala Val Gly Ile Cys Glu Glu His Gly Ser Ala Met
130 135 140

Ala Pro Gln His Ile Leu Ala Ser Gly Ser Thr Ala Gly Pro Lys Leu
145 150 155 160

Tyr Pro Lys Leu Tyr Thr Asp Val His Thr His Thr His Thr His Thr
165 170 175

Cys Thr His Thr Leu Ser Cys Trp Arg Ala Arg Phe Ile Asn Thr Ser
180 185 190

Met Ser Thr Ile Ser Ala Lys Tyr Ser Glu Ser Pro Ser Thr Val Ser
195 200 205

<210> 9
<211> 624
<212> DNA
<213> Mus musculus

<400> 9
cgcgtccggc ccacgggtga tgtgtggtca cggcctgatg gctcctacct caacaagctg 60
ctcatctctc gggcccgcca ggatgatgct ggcatgtaca tctgcctagg tgcaaatacc 120
atgggctaca gtttccgtag cgccttcctc actgtattac cagaccccaa acctccaggg 180
cctcctatgg cttcttcatc gtcatccaca agcctgccat ggcctgcggt gatcggcatc 240
ccagctggtg ctgtcttcat cctaggcact gtgctgctct ggctttgcca gaccaagaag 300
aagccatgtg ccccagcatc tacacttcct gtgcctgggc atcgtccccc agggacatcc 360
cgagaacgca gtggtgacaa ggacctgccc ccattggctg tgggcatatg tgaggagcat 420
ggatccgcca tggcccccca gcacatcctg gcctctggct caactgctgg ccccaagctg 480
taccccaagc tatacacaga tgtgcacaca cacacacata cacacacctg cactcacacg 540
ctctcatgtt ggagggcaag gttcatcaac accagcatgt ccactatcag tgctaaatac 600
agcgaatctc caagcactgt gtcc 624



<210> 10 

<211> 1423 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (31)...(444)



<400> 10 

gtcgacccac gcgtccgccc acgcgtccgg atg cct gga ccc aga gtg tgg ggg 54
Met Pro Gly Pro Arg Val Trp Gly
1 5

aaa tat ctc tgg aga agc cct cac tcc aaa ggc tgt cca ggc gca atg 102
Lys Tyr Leu Trp Arg Ser Pro His Ser Lys Gly Cys Pro Gly Ala Met
10 15 20

tgg tgg ctg ctt ctc tgg gga gtc ctc cag gct tgc cca acc cgg ggc 150
Trp Trp Leu Leu Leu Trp Gly Val Leu Gln Ala Cys Pro Thr Arg Gly
25 30 35 40

tcc gtc ctc ttg gcc caa gag cta ccc cag cag ctg aca tcc ccc ggg 198
Ser Val Leu Leu Ala Gln Glu Leu Pro Gln Gln Leu Thr Ser Pro Gly
45 50 55

tac cca gag ccg tat ggc aaa ggc caa gag agc agc acg gac atc aag 246
Tyr Pro Glu Pro Tyr Gly Lys Gly Gln Glu Ser Ser Thr Asp Ile Lys
60 65 70

gct cca gag ggc ttt gct gtg agg ctc gtc ttc cag gac ttc gac ctg 294
Ala Pro Glu Gly Phe Ala Val Arg Leu Val Phe Gln Asp Phe Asp Leu
75 80 85

gag ccg tcc cag gac tgt gca ggg gac tct gtc aca gtg agc tgg gga 342
Glu Pro Ser Gln Asp Cys Ala Gly Asp Ser Val Thr Val Ser Trp Gly
90 95 100

tgg ggg ggg tcc cgc cag gac tgt ggc cag gga gat tcc cgg ggt tgt 390
Trp Gly Gly Ser Arg Gln Asp Cys Gly Gln Gly Asp Ser Arg Gly Cys
105 110 115 120

ggg aag tgg cgg tgc cct gaa tcc ccc atc tgg agg agg gat gaa ttt 438
Gly Lys Trp Arg Cys Pro Glu Ser Pro Ile Trp Arg Arg Asp Glu Phe
125 130 135

tcc atg taggggcagt cgggcttggc ttaccgggga gcagtggtgg accccaggac 494
Ser Met



acagcctccc accagcgcct ccggggctgc catctgggcc ccacagagca aagagggcag 554
caagcaggcc ctgcgtttgg aaggcttatg aatggacaca caaatcttgc aaatctatgg 614
agccaggggc agggacgcac atattggttg ttaaaaatat gtcatcatgt atttgttgag 674
tgcctgctct atcaggtgag gaagctggac acaaataata acaaaagatt aagtcaccgt 734
tcacacttac cttggaagag ctattacaaa acttctaacg ccaaagcctt attcagaata 794
aggacatttt aaaaacagta cttgatggag tgatgcaagc ttgcagtccc agcagtatag 854
tcaggagact gaggctggag gatcagaggg ctggagccca gggttcaagg ccagcctaag 914
caacatagca agaccccatc tcaaaaataa gtaaataata aataaaaata aaaagagcac 974
attatctttt gatttaaatt ttatttatat caaaatgaca taaatttttg aactttattt 1034
tttaatttta aaatttttaa ttattatgga tacataatag ttgtaagact ttttgttttt 1094
taattaaagt tttctaaggc tgggcgcagt agctcatgtc tgtagtccca gcactttggg 1154
aggctgaggc gaaagaagca cttgagccca ggaatttgag accagcctgg gcaacatagc 1214
aagaccccat ctctacaaaa aaatttaaaa attagccaag tgtggtggca cgcacctgtg 1274
gtcccagcta caagggacgc tgaagtgaga ggatcacttg agcctggaag gtagaggctg 1334
cagtgagctc tgatcatgac accgtactcc agcctgggtg acagagtgag accctgtctc 1394
caaaaaaaaa aaaaaaaaag ggcggccgc 1423



<210> 11 
<211> 138 
<212> PRT 

<213> Homo sapiens 



<400> 11

Met Pro Gly Pro Arg Val Trp Gly Lys Tyr Leu Trp Arg Ser Pro His
1 5 10 15

Ser Lys Gly Cys Pro Gly Ala Met Trp Trp Leu Leu Leu Trp Gly Val
20 25 30

Leu Gln Ala Cys Pro Thr Arg Gly Ser Val Leu Leu Ala Gln Glu Leu
35 40 45

Pro Gln Gln Leu Thr Ser Pro Gly Tyr Pro Glu Pro Tyr Gly Lys Gly
50 55 60

Gln Glu Ser Ser Thr Asp Ile Lys Ala Pro Glu Gly Phe Ala Val Arg
65 70 75 80

Leu Val Phe Gln Asp Phe Asp Leu Glu Pro Ser Gln Asp Cys Ala Gly
85 90 95

Asp Ser Val Thr Val Ser Trp Gly Trp Gly Gly Ser Arg Gln Asp Cys
100 105 110

Gly Gln Gly Asp Ser Arg Gly Cys Gly Lys Trp Arg Cys Pro Glu Ser
115 120 125

Pro Ile Trp Arg Arg Asp Glu Phe Ser Met
130 135















<210> 12 
<211> 414 

<212> DNA 

<213> Homo sapiens 



<400> 12 

atgcctggac ccagagtgtg ggggaaatat ctctggagaa gccctcactc caaaggctgt 60 

ccaggcgcaa tgtggtggct gcttctctgg ggagtcctcc aggcttgccc aacccggggc 120 

tccgtcctct tggcccaaga gctaccccag cagctgacat cccccgggta cccagagccg 180

tatggcaaag gccaagagag cagcacggac atcaaggctc cagagggctt tgctgtgagg 240

ctcgtcttcc aggacttcga cctggagccg tcccaggact gtgcagggga ctctgtcaca 300 

gtgagctggg gatggggggg gtcccgccag gactgtggcc agggagattc ccggggttgt 360 

gggaagtggc ggtgccctga atcccccatc tggaggaggg atgaattttc catg 414 

<210> 13 
<211> 5036 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (230)...(3379)
<400> 13

gtcgacccac gcgtccgctc gaagcgggga ccctcgcccc gtcctcggct gtccagtcct 60
cctcctcgca gaccccggcg gttcctaccc caggccgcag gggagacggt gccccaaggc 120
aggcttcata tcctgaacgc tgggatcccc caggacattc cctggccccc aggccccagg 180
tcccaggccc cagggctgag ctgtgggcag gccccacctg gcctctgca atg tca ccg 238
Met Ser Pro
1

cct ctg tgt ccc ctc ctt ctc ctg gct gtg ggc ctg cgg ctg gct gga 286
Pro Leu Cys Pro Leu Leu Leu Leu Ala Val Gly Leu Arg Leu Ala Gly
5 10 15

act ctc aac ccc agt gat ccc aat acc tgc agc ttc tgg gaa agc ttc 334
Thr Leu Asn Pro Ser Asp Pro Asn Thr Cys Ser Phe Trp Glu Ser Phe
20 25 30 35

act acc acc acc aag gag tcc cac tcc cgc ccc ttc agc ctg ctc ccc 382
Thr Thr Thr Thr Lys Glu Ser His Ser Arg Pro Phe Ser Leu Leu Pro
40 45 50

tca gag ccc tgc gag cgg ccc tgg gag ggc ccc cat act tgc ccc agc 430
Ser Glu Pro Cys Glu Arg Pro Trp Glu Gly Pro His Thr Cys Pro Ser
55 60 65

cca caa act cag agg aaa ctc ctg gct tct agg gat tca ttc tgc atg 478
Pro Gln Thr Gln Arg Lys Leu Leu Ala Ser Arg Asp Ser Phe Cys Met
70 75 80

gtc tgt gtc ggg gct gga gtg cag tgg cga gat cgt agt gca ctg caa 526
Val Cys Val Gly Ala Gly Val Gln Trp Arg Asp Arg Ser Ala Leu Gln
85 90 95

cct caa aca ggg aat gcg ctt tct atg cgc cct cag ccc aga gtg ttg 574
Pro Gln Thr Gly Asn Ala Leu Ser Met Arg Pro Gln Pro Arg Val Leu
100 105 110 115

agt ggt gcc cct tcc ctg gcc tcc cct ggc cac act gtg gtg gtg aag 622
Ser Gly Ala Pro Ser Leu Ala Ser Pro Gly His Thr Val Val Val Lys
120 125 130

acg gac cac cgc cag cgc ctg cag tgc tgc cat ggc ttc tat gag agc 670
Thr Asp His Arg Gln Arg Leu Gln Cys Cys His Gly Phe Tyr Glu Ser
135 140 145

agg ggg ttc tgt gtc ccg ctc tgt gcc cag gag tgt gtc cat ggc cgt 718
Arg Gly Phe Cys Val Pro Leu Cys Ala Gln Glu Cys Val His Gly Arg
150 155 160

tgt gtg gca ccc aat cag tgc caa tgt gtg cca ggc tgg cgg ggc gac 766
Cys Val Ala Pro Asn Gln Cys Gln Cys Val Pro Gly Trp Arg Gly Asp
165 170 175

gac tgt tcc agt gcc ccg aac tgc ctt cag ccc tgt acc cct ggc tac 814
Asp Cys Ser Ser Ala Pro Asn Cys Leu Gln Pro Cys Thr Pro Gly Tyr
180 185 190 195

tat ggc cct gcc tgc cag ttc cgc tgc cag tgc cat ggg gca ccc tgc 862
Tyr Gly Pro Ala Cys Gln Phe Arg Cys Gln Cys His Gly Ala Pro Cys
200 205 210

gat ccc cag act gga gcc tgc ttc tgc ccc gca gag aga act ggg ccc 910
Asp Pro Gln Thr Gly Ala Cys Phe Cys Pro Ala Glu Arg Thr Gly Pro
215 220 225

agc tgt gac gtg tcc tgt tcc cag ggc act tct ggc ttc ttc tgc ccc 958
Ser Cys Asp Val Ser Cys Ser Gln Gly Thr Ser Gly Phe Phe Cys Pro
230 235 240

agc acc cat cct tgc caa aat gga ggt gtc ttc caa acc cca cag ggc 1006
Ser Thr His Pro Cys Gln Asn Gly Gly Val Phe Gln Thr Pro Gln Gly
245 250 255

tcc tgc agc tgc ccc cct ggc tgg atg ggc acc atc tgc tcc ctg ccc 1054
Ser Cys Ser Cys Pro Pro Gly Trp Met Gly Thr Ile Cys Ser Leu Pro
260 265 270 275

tgc cca gag ggc ttt cac gga ccc aac tgc tcc cag gaa tgt cgc tgc 1102
Cys Pro Glu Gly Phe His Gly Pro Asn Cys Ser Gln Glu Cys Arg Cys
280 285 290

cac aac ggc ggc ctc tgt gac cga ttc act ggg cag tgc cgc tgc gct 1150
His Asn Gly Gly Leu Cys Asp Arg Phe Thr Gly Gln Cys Arg Cys Ala
295 300 305

ccg ggt tac act ggg gat cgg tgc cgg gag gag tgc ccg gtg ggc cgc 1198
Pro Gly Tyr Thr Gly Asp Arg Cys Arg Glu Glu Cys Pro Val Gly Arg
310 315 320

ttt ggg cag gac tgt gct gag acg tgc gac tgc gcc ccg gac gcc cgt 1246
Phe Gly Gln Asp Cys Ala Glu Thr Cys Asp Cys Ala Pro Asp Ala Arg
325 330 335

tgc ttc ccg gcc aac ggc gca tgt ctg tgc gaa cac ggc ttc act ggg 1294
Cys Phe Pro Ala Asn Gly Ala Cys Leu Cys Glu His Gly Phe Thr Gly
340 345 350 355

gac cgc tgc acg gat cgc ctc tgc ccc gac ggc ttc tac ggt ctc agc 1342
Asp Arg Cys Thr Asp Arg Leu Cys Pro Asp Gly Phe Tyr Gly Leu Ser
360 365 370

tgc cag gcc ccc tgc acc tgc gac cgg gag cac agc ctc agc tgc cac 1390
Cys Gln Ala Pro Cys Thr Cys Asp Arg Glu His Ser Leu Ser Cys His
375 380 385

ccg atg aac ggg gag tgc tcc tgc ctg ccg ggc tgg gcg ggc ctc cac 1438
Pro Met Asn Gly Glu Cys Ser Cys Leu Pro Gly Trp Ala Gly Leu His
390 395 400

tgc aac gag agc tgc ccg cag gac acg cat ggg cca ggg tgc cag gag 1486
Cys Asn Glu Ser Cys Pro Gln Asp Thr His Gly Pro Gly Cys Gln Glu
405 410 415

cac tgt ctc tgc ctg cac ggt ggc gtc tgc cag gct acc agc ggc ctc 1534
His Cys Leu Cys Leu His Gly Gly Val Cys Gln Ala Thr Ser Gly Leu
420 425 430 435

tgt cag tgc gcg ccg ggt tac acg ggc cct cac tgt gct agt ctt tgt 1582
Cys Gln Cys Ala Pro Gly Tyr Thr Gly Pro His Cys Ala Ser Leu Cys
440 445 450

cct cct gac acc tac ggt gtc aac tgt tct gca cgc tgc tca tgt gaa 1630
Pro Pro Asp Thr Tyr Gly Val Asn Cys Ser Ala Arg Cys Ser Cys Glu
455 460 465

aat gcc atc gcc tgc tca ccc atc gac ggc gag tgc gtc tgc aag gaa 1678
Asn Ala Ile Ala Cys Ser Pro Ile Asp Gly Glu Cys Val Cys Lys Glu
470 475 480

ggt tgg cag cgt ggt aac tgc tct gtg ccc tgc cca ccc gga acc tgg 1726
Gly Trp Gln Arg Gly Asn Cys Ser Val Pro Cys Pro Pro Gly Thr Trp
485 490 495

ggc ttc agt tgc aat gcc agc tgc cag tgt gcc cat gag gca gtc tgc 1774
Gly Phe Ser Cys Asn Ala Ser Cys Gln Cys Ala His Glu Ala Val Cys
500 505 510 515

agc ccc caa act gga gcc tgt acc tgc acc cct ggg tgg cat ggg gcc 1822
Ser Pro Gln Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp His Gly Ala
520 525 530

cac tgc cag ctg ccc tgt ccg aag ggg cag ttt gga gaa ggt tgt gcc 1870
His Cys Gln Leu Pro Cys Pro Lys Gly Gln Phe Gly Glu Gly Cys Ala
535 540 545

agt cgc tgt gac tgt gac cac tct gat ggc tgt gac cct gtt cat gga 1918
Ser Arg Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro Val His Gly
550 555 560

cgc tgt cag tgc cag gct ggc tgg atg ggt gcc cgc tgc cac ctg tcc 1966
Arg Cys Gln Cys Gln Ala Gly Trp Met Gly Ala Arg Cys His Leu Ser
565 570 575

tgc cct gag ggc tta tgg gga gtc aac tgt agc aac acc tgc acc tgc 2014
Cys Pro Glu Gly Leu Trp Gly Val Asn Cys Ser Asn Thr Cys Thr Cys
580 585 590 595

aag aat ggg ggc acc tgt ctc cct gag aat ggc aac tgc gtg tgt gca 2062
Lys Asn Gly Gly Thr Cys Leu Pro Glu Asn Gly Asn Cys Val Cys Ala
600 605 610

ccc gga ttc cgg ggc ccc tcc tgc cag aga tcc tgt cag cct ggc cgc 2110
Pro Gly Phe Arg Gly Pro Ser Cys Gln Arg Ser Cys Gln Pro Gly Arg
615 620 625

tat ggc aaa cgc tgt gtg ccc tgc aag tgc gct aac cac tcc ttc tgc 2158
Tyr Gly Lys Arg Cys Val Pro Cys Lys Cys Ala Asn His Ser Phe Cys
630 635 640

cac ccc tcg aac ggg acc tgc tac tgc ctg gct ggc tgg aca ggc ccc 2206
His Pro Ser Asn Gly Thr Cys Tyr Cys Leu Ala Gly Trp Thr Gly Pro
645 650 655

gac tgc tcc cag cca tgc cct cca gga cac tgg gga gaa aac tgt gcc 2254
Asp Cys Ser Gln Pro Cys Pro Pro Gly His Trp Gly Glu Asn Cys Ala
660 665 670 675

cag acc tgc caa tgt cac cat ggt ggg acc tgc cat ccc cag gat ggg 2302
Gln Thr Cys Gln Cys His His Gly Gly Thr Cys His Pro Gln Asp Gly
680 685 690

agc tgt atc tgc ccc cta ggc tgg act gga cac cac tgc tta gaa ggc 2350
Ser Cys Ile Cys Pro Leu Gly Trp Thr Gly His His Cys Leu Glu Gly
695 700 705

tgc cct ctg ggg aca ttt ggt gct aac tgc tcc cag cca tgc cag tgt 2398
Cys Pro Leu Gly Thr Phe Gly Ala Asn Cys Ser Gln Pro Cys Gln Cys
710 715 720

ggt cct gga gaa aag tgc cac cca gag act ggg gcc tgt gta tgt ccc 2446
Gly Pro Gly Glu Lys Cys His Pro Glu Thr Gly Ala Cys Val Cys Pro
725 730 735

cca ggg cac agt ggt gca cct tgc agg att gga atc cag gag ccc ttt 2494
Pro Gly His Ser Gly Ala Pro Cys Arg Ile Gly Ile Gln Glu Pro Phe
740 745 750 755

act gtg atg ccg acc act cca gta gcg tat aac tcg ctg ggt gca gtg 2542
Thr Val Met Pro Thr Thr Pro Val Ala Tyr Asn Ser Leu Gly Ala Val
760 765 770

att ggc att gca gtg ctg ggg tcc ctt gtg gta gcc ctg gtg gca ctg 2590
Ile Gly Ile Ala Val Leu Gly Ser Leu Val Val Ala Leu Val Ala Leu
775 780 785

ttc att ggc tat cgg cac tgg caa aaa ggc aag gag cac cac cac ctg 2638
Phe Ile Gly Tyr Arg His Trp Gln Lys Gly Lys Glu His His His Leu
790 795 800

gct gtg gct tac agc agc ggg cgc ctg gac ggc tcc gag tat gtc atg 2686
Ala Val Ala Tyr Ser Ser Gly Arg Leu Asp Gly Ser Glu Tyr Val Met
805 810 815

cca gat gtc cct ccg agc tac agt cac tac tac tcc aac ccc agc tac 2734
Pro Asp Val Pro Pro Ser Tyr Ser His Tyr Tyr Ser Asn Pro Ser Tyr
820 825 830 835

cac acc ctg tcg cag tgc tcc cca aac ccc cca ccc cct aac aag gtt 2782
His Thr Leu Ser Gln Cys Ser Pro Asn Pro Pro Pro Pro Asn Lys Val
840 845 850

cca ggc ccg ctc ttt gcc agc ctg cag aac cct gag cgg cca ggt ggg 2830
Pro Gly Pro Leu Phe Ala Ser Leu Gln Asn Pro Glu Arg Pro Gly Gly
855 860 865

gcc caa ggg cat gat aac cac acc acc ctg cct gct gac tgg aag cac 2878
Ala Gln Gly His Asp Asn His Thr Thr Leu Pro Ala Asp Trp Lys His
870 875 880

cgc cgg gag ccc cct cca ggg cct ctg gac agg ggg agc agc cgc ctg 2926
Arg Arg Glu Pro Pro Pro Gly Pro Leu Asp Arg Gly Ser Ser Arg Leu
885 890 895

gac cga agc tac agc tat agc tac agc aat ggc cca ggc cca ttc tac 2974
Asp Arg Ser Tyr Ser Tyr Ser Tyr Ser Asn Gly Pro Gly Pro Phe Tyr
900 905 910 915

gat aaa ggg ctc atc tct gaa gag gag ctc ggg gcc agt gtg gct tcc 3022
Asp Lys Gly Leu Ile Ser Glu Glu Glu Leu Gly Ala Ser Val Ala Ser
920 925 930

ctg agc agt gag aac cca tat gcc acc atc cgg gac ctg ccc agc ttg 3070
Leu Ser Ser Glu Asn Pro Tyr Ala Thr Ile Arg Asp Leu Pro Ser Leu
935 940 945

cca ggg ggc ccc cgg gag agc agc tac atg gag atg aaa ggc cct ccc 3118
Pro Gly Gly Pro Arg Glu Ser Ser Tyr Met Glu Met Lys Gly Pro Pro
950 955 960

tca gga tct gcc ccc agg cag cct cct cag ttt tgg gac agc cag agg 3166
Ser Gly Ser Ala Pro Arg Gln Pro Pro Gln Phe Trp Asp Ser Gln Arg
965 970 975

^99 cgg caa ccc cag cca cag aga gac agt ggc acc tac gag cag ccc 3214
Arg Arg Gln Pro Gln Pro Gln Arg Asp Ser Gly Thr Tyr Glu Gln Pro
980 985 990 995

agc ccc ctg atc cat gac cga gac tct gtg ggc tcc cag ccc cct ctg 3262
Ser Pro Leu Ile His Asp Arg Asp Ser Val Gly Ser Gln Pro Pro Leu
1000 1005 1010

cct ccg ggc cta ccc ccc ggc cac tat gac tca ccc aag aac agc cac 3310
Pro Pro Gly Leu Pro Pro Gly His Tyr Asp Ser Pro Lys Asn Ser His
1015 1020 1025

atc cct gga cat tat gac ttg cct cca gta cgg cat ccc cca tca cct 3358
Ile Pro Gly His Tyr Asp Leu Pro Pro Val Arg His Pro Pro Ser Pro
1030 1035 1040

cca ctt cga cgc cag gac cgt tgaggagcca ggatggtatg gcagaggcca 3409
Pro Leu Arg Arg Gln Asp Arg
1045 1050

gcacacctgg ctgttgctgc tcaaggctgg ggacagagcc tagtgtaccc ctgccaggag 3469 

cagggagtgg accggcaggc tgtgaacatg aacaacgctt aacagagcaa gtgatgggag 3529 

ccttgttcct gggttctacc atgggagacg ctgatcagca ggatgcctgg ctccctttcc 3589

caacccactg ctcccaaggc ctccagggcc ctgtgtacat aaactggtgg gttggaagtt 3649

gctgggtaac tctgatttca gacatgcgtg tggggtacct tttctgtgca tgctcagcct 3709

gggctctgtg cgtgtgtgtg tttctgtgat tttagaaggg taccaggcag gttctgtcct 3769

agggcactta ccatttagta gggagatgga accaacccaa ttaactctag caatagcctc 3829 

ctaactggcc tcctccattg attcagtgaa ccttccaatg catggctcat aatttcaaaa 3889 

tacaggctgg ttagttactc cctacctgaa agccttcata ggtgcctctt tgctcttctg 3949 

ccagtatcaa aacttttgaa ggccttaaag gccctgcttt gcctggccca tctgtctctc 4009 

cagcctcacc ttgaactgtg ttcctgtcac tgcacgccag tcacaccggc ctctaggtcc 4069 

tcctgtaggc cactcttctt tctggcacag ggacctgcac acctggagtg cccttcctcc 4129 

cccactcgcc tgttcacccc tgcttttcct ttacacctcc tcctcaggga agtgcccacc 4189 

ctccgtacat ctttcacagc cctgattgca gctgtgttca ctcaccaggt acctgcagaa 4249

ggcctacagg gtgccaggca cttctttaat gggttctttc tttatgtgat tatttgatta 4309 

atctctgcct cccccactag actgtaagct ccctgaaggc aagaatcctg tgcttatgct 4369 

caatattagc tctcccttgg cacagagtag gcactcaaca aatgctcccc aaaaggctga 4429 

gtggctgact gaattaagta ccagtgacat gcagtaactg ctaagataga tgagccatct 4489

gtatgctctg acagttacag actgaataag ttggagactt ccctaaaggg tggcatttcc 4549 

ccagggtaac aacgcagagc tcaggtgtgg gaaggtgcca ggggcagggg tgcagagggg 4609 

ctgaggctga ggggggtgca gaggctggag aaaggataac aggagagagt atacaggcat 4669 

gccttgattt attgcacttc acaggtagca gaatttttaa agaaattgaa ggttttggga 4729

catatatgtg acagcaatag gttaagaaaa gcaaagcaga gaaattgaag atttgtgtca 4789 

acactgcttt aagcaaatct gttggcacca tttttccaat agcatgtgcic cattttgggt 4849 

ctctacattg cattttggta attgcttgca atatttcaag cattttcatt gttattatat 4909

gtgttatagt gatctgtgat cagtgatctt tgatatatta ttgtaattgt ttcggggcgc 4969 

catgaaccgc acccatataa cacggtaaac ttaatcagca aaaaaaaaaa aaaaaaaagg 5029 

gcggccg 5036 

<210> 14 

<211> 1050 

<212> PRT 

<213> Homo sapiens 





<400> 14

Met Ser Pro Pro Leu Cys Pro Leu Leu Leu Leu Ala Val Gly Leu Arg
1 5 10 15

Leu Ala Gly Thr Leu Asn Pro Ser Asp Pro Asn Thr Cys Ser Phe Trp
20 25 30

Glu Ser Phe Thr Thr Thr Thr Lys Glu Ser His Ser Arg Pro Phe Ser
35 40 45

Leu Leu Pro Ser Glu Pro Cys Glu Arg Pro Trp Glu Gly Pro His Thr
50 55 60

Cys Pro Ser Pro Gln Thr Gln Arg Lys Leu Leu Ala Ser Arg Asp Ser
65 70 75 80

Phe Cys Met Val Cys Val Gly Ala Gly Val Gln Trp Arg Asp Arg Ser
85 90 95

Ala Leu Gln Pro Gln Thr Gly Asn Ala Leu Ser Met Arg Pro Gln Pro
100 105 110

Arg Val Leu Ser Gly Ala Pro Ser Leu Ala Ser Pro Gly His Thr Val
115 120 125

Val Val Lys Thr Asp His Arg Gln Arg Leu Gln Cys Cys His Gly Phe
130 135 140

Tyr Glu Ser Arg Gly Phe Cys Val Pro Leu Cys Ala Gln Glu Cys Val
145 150 155 160

His Gly Arg Cys Val Ala Pro Asn Gln Cys Gln Cys Val Pro Gly Trp
165 170 175

Arg Gly Asp Asp Cys Ser Ser Ala Pro Asn Cys Leu Gln Pro Cys Thr
180 185 190

Pro Gly Tyr Tyr Gly Pro Ala Cys Gln Phe Arg Cys Gln Cys His Gly
195 200 205

Ala Pro Cys Asp Pro Gln Thr Gly Ala Cys Phe Cys Pro Ala Glu Arg
210 215 220

Thr Gly Pro Ser Cys Asp Val Ser Cys Ser Gln Gly Thr Ser Gly Phe
225 230 235 240

Phe Cys Pro Ser Thr His Pro Cys Gln Asn Gly Gly Val Phe Gln Thr
245 250 255

Pro Gln Gly Ser Cys Ser Cys Pro Pro Gly Trp Met Gly Thr Ile Cys
260 265 270

Ser Leu Pro Cys Pro Glu Gly Phe His Gly Pro Asn Cys Ser Gln Glu
275 280 285

Cys Arg Cys His Asn Gly Gly Leu Cys Asp Arg Phe Thr Gly Gln Cys
290 295 300

Arg Cys Ala Pro Gly Tyr Thr Gly Asp Arg Cys Arg Glu Glu Cys Pro
305 310 315 320

Val Gly Arg Phe Gly Gln Asp Cys Ala Glu Thr Cys Asp Cys Ala Pro
325 330 335

Asp Ala Arg Cys Phe Pro Ala Asn Gly Ala Cys Leu Cys Glu His Gly
340 345 350

Phe Thr Gly Asp Arg Cys Thr Asp Arg Leu Cys Pro Asp Gly Phe Tyr
355 360 365

Gly Leu Ser Cys Gln Ala Pro Cys Thr Cys Asp Arg Glu His Ser Leu
370 375 380

Ser Cys His Pro Met Asn Gly Glu Cys Ser Cys Leu Pro Gly Trp Ala
385 390 395 400

Gly Leu His Cys Asn Glu Ser Cys Pro Gln Asp Thr His Gly Pro Gly
405 410 415

Cys Gln Glu His Cys Leu Cys Leu His Gly Gly Val Cys Gln Ala Thr
420 425 430

Ser Gly Leu Cys Gln Cys Ala Pro Gly Tyr Thr Gly Pro His Cys Ala
435 440 445

Ser Leu Cys Pro Pro Asp Thr Tyr Gly Val Asn Cys Ser Ala Arg Cys
450 455 460

Ser Cys Glu Asn Ala Ile Ala Cys Ser Pro Ile Asp Gly Glu Cys Val
465 470 475 480

Cys Lys Glu Gly Trp Gln Arg Gly Asn Cys Ser Val Pro Cys Pro Pro
485 490 495

Gly Thr Trp Gly Phe Ser Cys Asn Ala Ser Cys Gln Cys Ala His Glu
500 505 510

Ala Val Cys Ser Pro Gln Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp
515 520 525

His Gly Ala His Cys Gln Leu Pro Cys Pro Lys Gly Gln Phe Gly Glu
530 535 540

Gly Cys Ala Ser Arg Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro
545 550 555 560

Val His Gly Arg Cys Gln Cys Gln Ala Gly Trp Met Gly Ala Arg Cys
565 570 575

His Leu Ser Cys Pro Glu Gly Leu Trp Gly Val Asn Cys Ser Asn Thr
580 585 590

Cys Thr Cys Lys Asn Gly Gly Thr Cys Leu Pro Glu Asn Gly Asn Cys
595 600 605

Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys Gln Arg Ser Cys Gln
610 615 620

Pro Gly Arg Tyr Gly Lys Arg Cys Val Pro Cys Lys Cys Ala Asn His
625 630 635 640

Ser Phe Cys His Pro Ser Asn Gly Thr Cys Tyr Cys Leu Ala Gly Trp
645 650 655

Thr Gly Pro Asp Cys Ser Gln Pro Cys Pro Pro Gly His Trp Gly Glu
660 665 670

Asn Cys Ala Gln Thr Cys Gln Cys His His Gly Gly Thr Cys His Pro
675 680 685

Gln Asp Gly Ser Cys Ile Cys Pro Leu Gly Trp Thr Gly His His Cys
690 695 700

Leu Glu Gly Cys Pro Leu Gly Thr Phe Gly Ala Asn Cys Ser Gln Pro
705 710 715 720

Cys Gln Cys Gly Pro Gly Glu Lys Cys His Pro Glu Thr Gly Ala Cys
725 730 735

Val Cys Pro Pro Gly His Ser Gly Ala Pro Cys Arg Ile Gly Ile Gln
740 745 750

Glu Pro Phe Thr Val Met Pro Thr Thr Pro Val Ala Tyr Asn Ser Leu
755 760 765

Gly Ala Val Ile Gly Ile Ala Val Leu Gly Ser Leu Val Val Ala Leu
770 775 780

Val Ala Leu Phe Ile Gly Tyr Arg His Trp Gln Lys Gly Lys Glu His
785 790 795 800

His His Leu Ala Val Ala Tyr Ser Ser Gly Arg Leu Asp Gly Ser Glu
805 810 815

Tyr Val Met Pro Asp Val Pro Pro Ser Tyr Ser His Tyr Tyr Ser Asn
820 825 830

Pro Ser Tyr His Thr Leu Ser Gln Cys Ser Pro Asn Pro Pro Pro Pro
835 840 845

Asn Lys Val Pro Gly Pro Leu Phe Ala Ser Leu Gln Asn Pro Glu Arg
850 855 860

Pro Gly Gly Ala Gln Gly His Asp Asn His Thr Thr Leu Pro Ala Asp
865 870 875 880

Trp Lys His Arg Arg Glu Pro Pro Pro Gly Pro Leu Asp Arg Gly Ser
885 890 895

Ser Arg Leu Asp Arg Ser Tyr Ser Tyr Ser Tyr Ser Asn Gly Pro Gly
900 905 910

Pro Phe Tyr Asp Lys Gly Leu Ile Ser Glu Glu Glu Leu Gly Ala Ser
915 920 925

Val Ala Ser Leu Ser Ser Glu Asn Pro Tyr Ala Thr Ile Arg Asp Leu
930 935 940

Pro Ser Leu Pro Gly Gly Pro Arg Glu Ser Ser Tyr Met Glu Met Lys
945 950 955 960

Gly Pro Pro Ser Gly Ser Ala Pro Arg Gln Pro Pro Gln Phe Trp Asp
965 970 975

Ser Gln Arg Arg Arg Gln Pro Gln Pro Gln Arg Asp Ser Gly Thr Tyr
980 985 990

Glu Gln Pro Ser Pro Leu Ile His Asp Arg Asp Ser Val Gly Ser Gln
995 1000 1005

Pro Pro Leu Pro Pro Gly Leu Pro Pro Gly His Tyr Asp Ser Pro Lys
1010 1015 1020

Asn Ser His Ile Pro Gly His Tyr Asp Leu Pro Pro Val Arg His Pro
1025 1030 1035 1040

Pro Ser Pro Pro Leu Arg Arg Gln Asp Arg
1045 1050



<210> 15 

<211> 3150 

<212> DNA 

<213> Homo sapiens 

<400> 15 

atgtcaccgc ctctgtgtcc cctccttctc ctggctgtgg gcctgcggct ggctggaact 60 

ctcaacccca gtgatcccaa tacctgcagc ttctgggaaa gcttcactac caccaccaag 120 

gagtcccact cccgcccctt cagcctgctc ccctcagagc cctgcgagcg gccctgggag 180 

ggcccccata cttgccccag cccacaaact cagaggaaac tcctggcttc tagggattca 240

ttctgcatgg tctgtgtcgg ggctggagtg cagtggcgag atcgtagtgc actgcaacct 300

caaacaggga atgcgctttc tatgcgccct cagcccagag tgttgagtgg tgccccttcc 360

ctggcctccc ctggccacac tgtggtggtg aagacggacc accgccagcg cctgcagtgc 420 

tgccatggct tctatgagag cagggggttc tgtgtcccgc tctgtgccca ggagtgtgtc 480 

catggccgtt gtgtggcacc caatcagtgc caatgtgtgc caggctggcg gggcgacgac 540 

tgttccagtg ccccgaactg ccttcagccc tgtacccctg gctactatgg ccctgcctgc 600 

cagttccgct gccagtgcca tggggcaccc tgcgatcccc agactggagc ctgcttctgc 660 

cccgcagaga gaactgggcc cagctgtgac gtgtcctgtt cccagggcac ttctggcttc 720 

ttctgcccca gcacccatcc ttgccaaaat ggaggtgtct tccaaacccc acagggctcc 780 

tgcagctgcc cccctggctg gatgggcacc atctgctccc tgccctgccc agagggcttt 840

cacggaccca actgctccca ggaatgtcgc tgccacaacg gcggcctctg tgaccgattc 900 

actgggcagt gccgctgcgc tccgggttac actggggatc ggtgccggga ggagtgcccg 960 

gtgggccgct ttgggcagga ctgtgctgag acgtgcgact gcgccccgga cgcccgttgc 1020 

ttcccggcca acggcgcatg tctgtgcgaa cacggcttca ctggggaccg ctgcacggat 1080

cgcctctgcc ccgacggctt ctacggtctc agctgccagg ccccctgcac ctgcgaccgg 1140

gagcacagcc tcagctgcca cccgatgaac ggggagtgct cctgcctgcc gggctgggcg 1200

ggcctccact gcaacgagag ctgcccgcag gacacgcatg ggccagggtg ccaggagcac 1260

tgtctctgcc tgcacggtgg cgtctgccag gctaccagcg gcctctgtca gtgcgcgccg 1320

ggttacacgg gccctcactg tgctagtctt tgtcctcctg acacctacgg tgtcaactgt 1380 

tctgcacgct gctcatgtga aaatgccatc gcctgctcac ccatcgacgg cgagtgcgtc 1440 

tgcaaggaag gttggcagcg tggtaactgc tctgtgccct gcccacccgg aacctggggc 1500 

ttcagttgca atgccagctg ccagtgtgcc catgaggcag tctgcagccc ccaaactgga 1560 

gcctgtacct gcacccctgg gtggcatggg gcccactgcc agctgccctg tccgaagggg 1620 

cagtttggag aaggttgtgc cagtcgctgt gactgtgacc actctgatgg ctgtgaccct 1680 

gttcatggac gctgtcagtg ccaggctggc tggatgggtg cccgctgcca cctgtcctgc 1740 

cctgagggct tatggggagt caactgtagc aacacctgca cctgcaagaa tgggggcacc 1800 

tgtctccctg agaatggcaa ctgcgtgtgt gcacccggat tccggggccc ctcctgccag 1860 

agatcctgtc agcctggccg ctatggcaaa cgctgtgtgc cctgcaagtg cgctaaccac 1920

tccttctgcc acccctcgaa cgggacctgc tactgcctgg ctggctggac aggccccgac 1980

tgctcccagc catgccctcc aggacactgg ggagaaaact gtgcccagac ctgccaatgt 2040

caccatggtg ggacctgcca tccccaggat gggagctgta tctgccccct aggctggact 2100 

ggacaccact gcttagaagg ctgccctctg gggacatttg gtgctaactg ctcccagcca 2160 

tgccagtgtg gtcctggaga aaagtgccac ccagagactg gggcctgtgt atgtccccca 2220 

gggcacagtg gtgcaccttg caggattgga atccaggagc cctttactgt gatgccgacc 2280 

actccagtag cgtataactc gctgggtgca gtgattggca ttgcagtgct ggggtccctt 2340

gtggtagccc tggtggcact gttcattggc tatcggcact ggcaaaaagg caaggagcac 2400

caccacctgg ctgtggctta cagcagcggg cgcctggacg gctccgagta tgtcatgcca 2460

gatgtccctc cgagctacag tcactactac tccaacccca gctaccacac cctgtcgcag 2520

tgctccccaa accccccacc ccctaacaag gttccaggcc cgctctttgc cagcctgcag 2580

aaccctgagc ggccaggtgg ggcccaaggg catgataacc acaccaccct gcctgctgac 2640

tggaagcacc gccgggagcc ccctccaggg cctctggaca gggggagcag ccgcctggac 2700

cgaagctaca gctatagcta cagcaatggc ccaggcccat tctacgataa agggctcatc 2760

tctgaagagg agctcggggc cagtgtggct tccctgagca gtgagaaccc atatgccacc 2820

atccgggacc tgcccagctt gccagggggc ccccgggaga gcagctacat ggagatgaaa 2880 

ggccctccct caggatctgc ccccaggcag cctcctcagt tttgggacag ccagaggcgg 2940 

cggcaacccc agccacagag agacagtggc acctacgagc agcccagccc cctgatccat 3000 

gaccgagact ctgtgggctc ccagccccct ctgcctccgg gcctaccccc cggccactat 3060 

gactcaccca agaacagcca catccctgga cattatgact tgcctccagt acggcatccc 3120 

ccatcacctc cacttcgacg ccaggaccgt 3150 

<210> 16 

<211> 2569 

<212> DNA 

<213> Mus musculus 

<220> 

<221> CDS 

<222> (2)...(1492)

<400> 16

g tcg acc cac gcg tcc ggt gac cct gtt cat gga cag tgc cga tgt cag 49
  Ser Thr His Ala Ser Gly Asp Pro Val His Gly Gln Cys Arg Cys Gln
  1               5                   10                  15

gct ggt tgg atg ggc aca cgc tgc cac ctg cct tgc ccg gag ggc ttt 97
Ala Gly Trp Met Gly Thr Arg Cys His Leu Pro Cys Pro Glu Gly Phe
            20                  25                  30

tgg gga gcc aac tgc agt aac acc tgt acc tgc aag aat ggt ggt acc 145
Trp Gly Ala Asn Cys Ser Asn Thr Cys Thr Cys Lys Asn Gly Gly Thr
        35                  40                  45

tgt gtg tct gag aat ggc aac tgc gtg tgc gca cca ggg ttc cga ggc 193
Cys Val Ser Glu Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly
    50                  55                  60

ccc tcc tgc cag agg ccc tgc ccg cct ggt cgc tat ggc aaa cgc tgt 241
Pro Ser Cys Gln Arg Pro Cys Pro Pro Gly Arg Tyr Gly Lys Arg Cys
65                  70                  75                  80

gtg caa tgc aag tgt aac aac aac cat tct tcc tgc cac cca tcg gac 289
Val Gln Cys Lys Cys Asn Asn Asn His Ser Ser Cys His Pro Ser Asp
                85                  90                  95

ggg acc tgc tcc tgc ctg gcg ggc tgg aca ggc cct gac tgc tcc gag 337
Gly Thr Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys Ser Glu
            100                 105                 110

gca tgt ccc cca ggc cac tgg gga ctc aaa tgc tcc caa ctc tgc cag 385
Ala Cys Pro Pro Gly His Trp Gly Leu Lys Cys Ser Gln Leu Cys Gln
        115                 120                 125

tgt cat cat ggt ggg acc tgc cac ccc cag gat ggg agc tgt atc tgc 433
Cys His His Gly Gly Thr Cys His Pro Gln Asp Gly Ser Cys Ile Cys
    130                 135                 140

acg cca ggc tgg act gga ccc aac tgc ttg gaa ggc tgc cca cca aga 481
Thr Pro Gly Trp Thr Gly Pro Asn Cys Leu Glu Gly Cys Pro Pro Arg
145                 150                 155                 160

atg ttt ggt gtc aac tgc tcc cag cta tgt cag tgt gat ctc gga gag 529
Met Phe Gly Val Asn Cys Ser Gln Leu Cys Gln Cys Asp Leu Gly Glu
                165                 170                 175

atg tgc cac cca gag act ggg gct tgt gtc tgt ccc cca gga cac agt 577
Met Cys His Pro Glu Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser
            180                 185                 190

ggt gca gac tgc aaa atg gga agc cag gag tcc ttc acc ata atg ccc 625
Gly Ala Asp Cys Lys Met Gly Ser Gln Glu Ser Phe Thr Ile Met Pro
        195                 200                 205

acc tct ccc gtg acc cat aac tca ctg ggt gca gtg att ggc att gca 673
Thr Ser Pro Val Thr His Asn Ser Leu Gly Ala Val Ile Gly Ile Ala
    210                 215                 220

gta ctg gga acc ctc gtg gtg gcc ctg ata gca ctg ttc att ggc tac 721
Val Leu Gly Thr Leu Val Val Ala Leu Ile Ala Leu Phe Ile Gly Tyr
225                 230                 235                 240

cgc cag tgg caa aag ggc aag gaa cat gag cac ttg gca gtg gct tac 769
Arg Gln Trp Gln Lys Gly Lys Glu His Glu His Leu Ala Val Ala Tyr
                245                 250                 255

agc act ggg cgg ctg gat ggc tct gat tac gtc atg cca gat gtc tct 817
Ser Thr Gly Arg Leu Asp Gly Ser Asp Tyr Val Met Pro Asp Val Ser
            260                 265                 270

ccg agc tat agt cac tac tac tcc aac ccc agc tac cac aca ctg tct 865
Pro Ser Tyr Ser His Tyr Tyr Ser Asn Pro Ser Tyr His Thr Leu Ser
        275                 280                 285

cag tgt tct cct aac ccc ccg ccc cct aac aag gtc cca ggc agt cag 913
Gln Cys Ser Pro Asn Pro Pro Pro Pro Asn Lys Val Pro Gly Ser Gln
    290                 295                 300

ctc ttt gtc agc tct cag gcc cct gag cgg cca agc aga gcc cac ggg 961
Leu Phe Val Ser Ser Gln Ala Pro Glu Arg Pro Ser Arg Ala His Gly
305                 310                 315                 320

cgt gag aac cat acc aca ctg ccc gct gac tgg aag cac cgc cgg gag 1009
Arg Glu Asn His Thr Thr Leu Pro Ala Asp Trp Lys His Arg Arg Glu
                325                 330                 335

ccc cat gac aga ggc gcc agc cac ctg gac cga agc tat agc tgt agc 1057
Pro His Asp Arg Gly Ala Ser His Leu Asp Arg Ser Tyr Ser Cys Ser
            340                 345                 350

tat agc cac agg aat ggc cca gga cca ttc tgt cat aaa ggt ccc atc 1105
Tyr Ser His Arg Asn Gly Pro Gly Pro Phe Cys His Lys Gly Pro Ile
        355                 360                 365

tct gaa gag gga cta ggg gca agc gtt atg tcc ctg agc agt gag aac 1153
Ser Glu Glu Gly Leu Gly Ala Ser Val Met Ser Leu Ser Ser Glu Asn
    370                 375                 380

ccc tat gct acc atc cga gac ctg ccc agc ctg cct ggg gaa ccc cga 1201
Pro Tyr Ala Thr Ile Arg Asp Leu Pro Ser Leu Pro Gly Glu Pro Arg
385                 390                 395                 400

gaa agt ggc tat gtg gag atg aaa gga cct cca tca gtg tcc cct ccc 1249
Glu Ser Gly Tyr Val Glu Met Lys Gly Pro Pro Ser Val Ser Pro Pro
                405                 410                 415

agg cag tct ctt cat ctc cgg gac agg cag cag cgg caa ctg cag cca 1297
Arg Gln Ser Leu His Leu Arg Asp Arg Gln Gln Arg Gln Leu Gln Pro
            420                 425                 430

cag agg gac agc ggc acc tat gag cag ccc agc ccc ttg agc cat aat 1345
Gln Arg Asp Ser Gly Thr Tyr Glu Gln Pro Ser Pro Leu Ser His Asn
        435                 440                 445

gaa gag tct ttg ggc tcc acg ccc ccg ctt cct cca ggc ctg cct cct 1393
Glu Glu Ser Leu Gly Ser Thr Pro Pro Leu Pro Pro Gly Leu Pro Pro
    450                 455                 460

ggt cac tac gac tcc ccc aag aac agc cat atc cct gga cac tat gac 1441
Gly His Tyr Asp Ser Pro Lys Asn Ser His Ile Pro Gly His Tyr Asp
465                 470                 475                 480

ttg cct cca gta cgg cat cct cca tcc cct cca tcc cgg cgc cag gac 1489
Leu Pro Pro Val Arg His Pro Pro Ser Pro Pro Ser Arg Arg Gln Asp
                485                 490                 495

cgc tgaagagccg gcatggtatg ggagcgtgcc tatgtacctt gccaggagca 1542
Arg



gggactggac cagcaggcca cgaacagaaa cacttggtga agtgaacaga gacggactgt 1602

ggccctgtgc ttccaccgag ggagacacta gttgacaaag tgtctaaccc tcttttccaa 1662

cccactgctc aagtccctgt ggacataagc tggtgggcag aatgttgttg tacaagt^tg 1722 

attttagatc gatttttttt taaagtatgt gttgggtacc ttttctgtgt gtatgctcag 1782

gcaggctgtg tgtgtctcta gttggcttta gagggagtca ggtataggtt ctgccttctg 1842 

cactttccat cttatctagt agtcagcttc caagcttaac tagttagagc tccaccagca 1902 

gcaggcccta actacctgcc tgcccttcac ccagtaatcc tccatgtctt tgctcagagg 1962 

attgctcccc gactctggtg ttgtcctcct ggtacgcctt gacggtcctg cagtctccct 2022 

ttcccgtctt gcttcattct ttcccagaat gaaggctgtc tgccacccta cttcccagcc 2082 

caggaattgg cacatctaag ttcagccttc ctaagttacc cgttgagtcc tgcttgccct 2142 

tcacatattc cacagaacac ccaccccaca tctgcttcat agctactctc ttctccacgt 2202 

acccacagaa ggcagaagtg gtaccaggca agaagatggg attgttgcat tttgttttgt 2262

ttttgagact ctgtctcact atgtagtcct ggctggcctg gaactcaaga gctctgcctg 2322

cctctgcctc ttgagtgctg ggtttaacgg ctcagggtca catgcacagc tcaagctgca 2382

ctccgatgtg ctttcccctg ttgctagatt agcgtctgcc tccccctagt ggagaggctg 2442 

atcgccagct ctctgatgca ggactctggt gtttaggctc actcactatt ggtttccttg 2502 

gcacagggta gtcactcaat aaatgttcct ctaaaagctg aaaaaaaaaa aaaaaaaggg 2562

cggccgc 2569

<210> 17 
<211> 497 
<212> PRT 

<213> Mus musculus





<400> 17

Ser Thr His Ala Ser Gly Asp Pro Val His Gly Gln Cys Arg Cys Gln
1               5                   10                  15

Ala Gly Trp Met Gly Thr Arg Cys His Leu Pro Cys Pro Glu Gly Phe
            20                  25                  30

Trp Gly Ala Asn Cys Ser Asn Thr Cys Thr Cys Lys Asn Gly Gly Thr
        35                  40                  45

Cys Val Ser Glu Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly
    50                  55                  60

Pro Ser Cys Gln Arg Pro Cys Pro Pro Gly Arg Tyr Gly Lys Arg Cys
65                  70                  75                  80

Val Gln Cys Lys Cys Asn Asn Asn His Ser Ser Cys His Pro Ser Asp
                85                  90                  95

Gly Thr Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys Ser Glu
            100                 105                 110

Ala Cys Pro Pro Gly His Trp Gly Leu Lys Cys Ser Gln Leu Cys Gln
        115                 120                 125

Cys His His Gly Gly Thr Cys His Pro Gln Asp Gly Ser Cys Ile Cys
    130                 135                 140

Thr Pro Gly Trp Thr Gly Pro Asn Cys Leu Glu Gly Cys Pro Pro Arg
145                 150                 155                 160

Met Phe Gly Val Asn Cys Ser Gln Leu Cys Gln Cys Asp Leu Gly Glu
                165                 170                 175

Met Cys His Pro Glu Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser
            180                 185                 190

Gly Ala Asp Cys Lys Met Gly Ser Gln Glu Ser Phe Thr Ile Met Pro
        195                 200                 205

Thr Ser Pro Val Thr His Asn Ser Leu Gly Ala Val Ile Gly Ile Ala
    210                 215                 220

Val Leu Gly Thr Leu Val Val Ala Leu Ile Ala Leu Phe Ile Gly Tyr
225                 230                 235                 240

Arg Gln Trp Gln Lys Gly Lys Glu His Glu His Leu Ala Val Ala Tyr
                245                 250                 255

Ser Thr Gly Arg Leu Asp Gly Ser Asp Tyr Val Met Pro Asp Val Ser
            260                 265                 270

Pro Ser Tyr Ser His Tyr Tyr Ser Asn Pro Ser Tyr His Thr Leu Ser
        275                 280                 285

Gln Cys Ser Pro Asn Pro Pro Pro Pro Asn Lys Val Pro Gly Ser Gln
    290                 295                 300

Leu Phe Val Ser Ser Gln Ala Pro Glu Arg Pro Ser Arg Ala His Gly
305                 310                 315                 320

Arg Glu Asn His Thr Thr Leu Pro Ala Asp Trp Lys His Arg Arg Glu
                325                 330                 335

Pro His Asp Arg Gly Ala Ser His Leu Asp Arg Ser Tyr Ser Cys Ser
            340                 345                 350

Tyr Ser His Arg Asn Gly Pro Gly Pro Phe Cys His Lys Gly Pro Ile
        355                 360                 365

Ser Glu Glu Gly Leu Gly Ala Ser Val Met Ser Leu Ser Ser Glu Asn
    370                 375                 380

Pro Tyr Ala Thr Ile Arg Asp Leu Pro Ser Leu Pro Gly Glu Pro Arg
385                 390                 395                 400

Glu Ser Gly Tyr Val Glu Met Lys Gly Pro Pro Ser Val Ser Pro Pro
                405                 410                 415

Arg Gln Ser Leu His Leu Arg Asp Arg Gln Gln Arg Gln Leu Gln Pro
            420                 425                 430

Gln Arg Asp Ser Gly Thr Tyr Glu Gln Pro Ser Pro Leu Ser His Asn
        435                 440                 445

Glu Glu Ser Leu Gly Ser Thr Pro Pro Leu Pro Pro Gly Leu Pro Pro
    450                 455                 460

Gly His Tyr Asp Ser Pro Lys Asn Ser His Ile Pro Gly His Tyr Asp
465                 470                 475                 480

Leu Pro Pro Val Arg His Pro Pro Ser Pro Pro Ser Arg Arg Gln Asp
                485                 490                 495

Arg



<210> 18 
<211> 1491 

<212> DNA 

<213> Mus musculus 
<400> 18 

tcgacccacg cgtccggtga ccctgttcat ggacagtgcc gatgtcaggc tggttggatg 60 
ggcacacgct gccacctgcc ttgcccggag ggcttttggg gagccaactg cagtaacacc 120 
tgtacctgca agaatggtgg tacctgtgtg tctgagaatg gcaactgcgt gtgcgcacca 180 
gggttccgag gcccctcctg ccagaggccc tgcccgcctg gtcgctatgg caaacgctgt 240 
gtgcaatgca agtgtaacaa caaccattct tcctgccacc catcggacgg gacctgctcc 300 
tgcctggcgg gctggacagg ccctgactgc tccgaggcat gtcccccagg ccactgggga 360 
ctcaaatgct cccaactctg ccagtgtcat catggtggga cctgccaccc ccaggatggg 420
agctgtatct gcacgccagg ctggactgga cccaactgct tggaaggctg cccaccaaga 480
atgtttggtg tcaactgctc ccagctatgt cagtgtgatc tcggagagat gtgccaccca 540
gagactgggg cttgtgtctg tcccccagga cacagtggtg cagactgcaa aatgggaagc 600
caggagtcct tcaccataat gcccacctct cccgtgaccc ataactcact gggtgcagtg 660
attggcattg cagtactggg aaccctcgtg gtggccctga tagcactgtt cattggctac 720
cgccagtggc aaaagggcaa ggaacatgag cacttggcag tggcttacag cactgggcgg 780
ctggatggct ctgattacgt catgccagat gtctctccga gctatagtca ctactactcc 840
aaccccagct accacacact gtctcagtgt tctcctaacc ccccgccccc taacaaggtc 900
ccaggcagtc agctctttgt cagctctcag gcccctgagc ggccaagcag agcccacggg 960
cgtgagaacc ataccacact gcccgctgac tggaagcacc gccgggagcc ccatgacaga 1020
ggcgccagcc acctggaccg aagctatagc tgtagctata gccacaggaa tggcccagga 1080
ccattctgtc ataaaggtcc catctctgaa gagggactag gggcaagcgt tatgtccctg 1140
agcagtgaga acccctatgc taccatccga gacctgccca gcctgcctgg ggaaccccga 1200
gaaagtggct atgtggagat gaaaggacct ccatcagtgt cccctcccag gcagtctctt 1260
catctccggg acaggcagca gcggcaactg cagccacaga gggacagcgg cacctatgag 1320
cagcccagcc ccttgagcca taatgaagag tctttgggct ccacgccccc gcttcctcca 1380
ggcctgcctc ctggtcacta cgactccccc aagaacagcc atatccctgg acactatgac 1440
ttgcctccag tacggcatcc tccatcccct ccatcccggc gccaggaccg c 1491



<210> 19 
<211> 3567 
<212> DNA 
<213> Rattus sp.



<220> 
<221> CDS 
<222> (925)...(2832)



<400> 19 

gtccgaccca cgcgtccgag ccacaccctg aaggtggttg gaaggaggga aggatctagg 60 

tcctgagcac tggaattccc cagaacagca tctggcttcc cagacccatg ctggccacca 120 

ctgatgtgtc cttccggctg ctggctgcag tgctgttctg ttgttgggtg ccctgtggca 180 

ggcttgtgca atgccactct gtcccctcct cctcctggcc ctaggcctgc gtctggctgg 240 

aacactcaac tccaatgatc ccaatgtctg taccttctgg gaaagcttca ccacgaccac 300

taaggagtcc caccttcgcc ccttcagcct gcccccagcc gagtcctgcg acaggccctg 360

ggaagacccc cacacctgcg ctcagcctac ggttgtctac cggactgtgt accgtcaggt 420 

ggtgaagatg gactcccgcc cacgcctgca gtgctgtggg ggttactacg agagcagtgg 480 

agcctgtgtc ccactctgtg cccaggagtg tgtccacggt cgctgtgtgg ctcctaatcg 540 

gtgccagtgt gcaccaggct ggcggggtga cgactgttcc agtgagtgtg ctcctggaat 600

gtggggacca cagtgtgaca ggctctgcct ctgtggcaac agcagttcct gtgatcccag 660

gagtggggtg tgtttttgcc cctctggcct gcagcccccc gactgccttc agccttgccc 720

cgatggccac tatggtcctg cctgccagtt tgattgccat tgctatgggg catcctgtga 780 

cccccgggat ggagcctgct tqtgcccccc agggagaaca ggacccaggg cactgatggc 840 

ttcttctgcG ccagaactta tccttgccaa aatggaggtg ttcctcaggg ctctcaaggc 900 

tcctgcagct gcccaccggg ctgg atg ggt gtc atc tgt tcc ctg cca tgc 951
                           Met Gly Val Ile Cys Ser Leu Pro Cys
                           1               5

cca gag ggt ttc cac gga ccc aac tgt act cag gaa tgt cgt tgc cac 999
Pro Glu Gly Phe His Gly Pro Asn Cys Thr Gln Glu Cys Arg Cys His
10                  15                  20                  25

aat ggt ggc ctt tgt gac agg ttt act ggg cag tgc cac tgt gct cct 1047
Asn Gly Gly Leu Cys Asp Arg Phe Thr Gly Gln Cys His Cys Ala Pro
                30                  35                  40

ggc tat atc ggg gat cgg tgc cgt gaa gag tgc cct gtg ggc cgc ttc 1095
Gly Tyr Ile Gly Asp Arg Cys Arg Glu Glu Cys Pro Val Gly Arg Phe
            45                  50                  55

ggt caa gac tgt gct gag acc tgt gac tgt gct cct ggc gct cgt tgc 1143
Gly Gln Asp Cys Ala Glu Thr Cys Asp Cys Ala Pro Gly Ala Arg Cys
        60                  65                  70

ttt cct gcc aat ggc gcg tgt ctg tgc gaa cat ggc ttc aca ggc gac 1191
Phe Pro Ala Asn Gly Ala Cys Leu Cys Glu His Gly Phe Thr Gly Asp
    75                  80                  85

cgc tgc act gag cga ctc tgt cca gat ggc cgc tat ggt ctg agc tgc 1239
Arg Cys Thr Glu Arg Leu Cys Pro Asp Gly Arg Tyr Gly Leu Ser Cys
90                  95                  100                 105

caa gat ccc tgc acc tgc gac cca gaa cac agt ctc agc tgc cac cca 1287
Gln Asp Pro Cys Thr Cys Asp Pro Glu His Ser Leu Ser Cys His Pro
                110                 115                 120

atg cac ggc gag tgc tcc tgc cag cca ggt tgg gcg ggc ctc cac tgc 1335
Met His Gly Glu Cys Ser Cys Gln Pro Gly Trp Ala Gly Leu His Cys
            125                 130                 135

aac gag agc tgc cct cag gac acg cac gga gcc ggt tgc cag gag cac 1383
Asn Glu Ser Cys Pro Gln Asp Thr His Gly Ala Gly Cys Gln Glu His
        140                 145                 150

tgc ctc tgt ctg cac ggc ggt gtt tgc ctc gcc gac agc ggc ctc tgc 1431
Cys Leu Cys Leu His Gly Gly Val Cys Leu Ala Asp Ser Gly Leu Cys
    155                 160                 165

cgg tgt gca cct ggc tac acg gga cct cac tgc gct aat ctt tgt cca 1479
Arg Cys Ala Pro Gly Tyr Thr Gly Pro His Cys Ala Asn Leu Cys Pro
170                 175                 180                 185

cct aac act tat ggg atc aac tgt tcc tcc cac tgc tcc tgt gaa aat 1527
Pro Asn Thr Tyr Gly Ile Asn Cys Ser Ser His Cys Ser Cys Glu Asn
                190                 195                 200

gcc att gcc tgc tct cct gtc gac ggc acg tgc atc tgc aag gaa ggt 1575
Ala Ile Ala Cys Ser Pro Val Asp Gly Thr Cys Ile Cys Lys Glu Gly
            205                 210                 215

tgg cag cgt ggt aac tgc tct gtg ccc tgt ccc cct ggc acc tgg ggc 1623
Trp Gln Arg Gly Asn Cys Ser Val Pro Cys Pro Pro Gly Thr Trp Gly
        220                 225                 230

ttc agt tgc aat gcc agt tgc cag tgt gcc cac gag gga gtc tgc agc 1671
Phe Ser Cys Asn Ala Ser Cys Gln Cys Ala His Glu Gly Val Cys Ser
    235                 240                 245

ccc caa act gga gcc tgt act tgc acc cct ggg tgg cgt ggg gtt cac 1719
Pro Gln Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp Arg Gly Val His
250                 255                 260                 265

tgc caa ctt ccg tgc ccg aag gga cag ttt ggt gaa ggt tgt gcc agt 1767
Cys Gln Leu Pro Cys Pro Lys Gly Gln Phe Gly Glu Gly Cys Ala Ser
                270                 275                 280

gtc tgt gac tgt gac cac tcc gat ggc tgt gac cct gtt cat gga cac 1815
Val Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro Val His Gly His
            285                 290                 295

tgc cga tgt cag gct ggc tgg atg ggc aca cgt tgc cac ctg cct tgc 1863
Cys Arg Cys Gln Ala Gly Trp Met Gly Thr Arg Cys His Leu Pro Cys
        300                 305                 310

cca gag ggc ttt tgg gga gcc aac tgc agc aat gcc tgt acc tgc aag 1911
Pro Glu Gly Phe Trp Gly Ala Asn Cys Ser Asn Ala Cys Thr Cys Lys
    315                 320                 325

aat ggt ggc act tgt gta cct gag aac ggc aac tgt gtg tgc gca cca 1959
Asn Gly Gly Thr Cys Val Pro Glu Asn Gly Asn Cys Val Cys Ala Pro
330                 335                 340                 345

ggg ttc aga ggc ccc tcc tgc cag agg ccc tgc ccg cct ggt cgc tat 2007
Gly Phe Arg Gly Pro Ser Cys Gln Arg Pro Cys Pro Pro Gly Arg Tyr
                350                 355                 360

ggc aaa cgc tgt gtg ccc tgc aag tgc aac aac cat tct tcc tgc cac 2055
Gly Lys Arg Cys Val Pro Cys Lys Cys Asn Asn His Ser Ser Cys His
            365                 370                 375

ccg tcg gat ggg acc tgc tcc tgc ctg gca ggc tgg aca ggc cct gac 2103
Pro Ser Asp Gly Thr Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp
        380                 385                 390

tgc tct gaa tca tgt ccc cca ggc cac tgg gga ctc aaa tgc tcc caa 2151
Cys Ser Glu Ser Cys Pro Pro Gly His Trp Gly Leu Lys Cys Ser Gln
    395                 400                 405

ccc tgc cag tgt cat cat ggt gcc acc tgc cac ccc cag gat ggg agc 2199
Pro Cys Gln Cys His His Gly Ala Thr Cys His Pro Gln Asp Gly Ser
410                 415                 420                 425

tgt gtc tgc atc cca ggc tgg act gga ccc aac tgc tcg gaa ggc tgc 2247
Cys Val Cys Ile Pro Gly Trp Thr Gly Pro Asn Cys Ser Glu Gly Cys
                430                 435                 440

cca tca aga atg ttt ggt gtc aac tgc tcc cag cta tgt cag tgt gat 2295
Pro Ser Arg Met Phe Gly Val Asn Cys Ser Gln Leu Cys Gln Cys Asp
            445                 450                 455

cct gga gag atg tgc cac cca gag act ggg gct tgc gtc tgt ccc cca 2343
Pro Gly Glu Met Cys His Pro Glu Thr Gly Ala Cys Val Cys Pro Pro
        460                 465                 470

gga cac agt ggt gcg cac tgc aaa gtg ggc agc cag gag tcc ttc acc 2391
Gly His Ser Gly Ala His Cys Lys Val Gly Ser Gln Glu Ser Phe Thr
    475                 480                 485

ata atg ccc acc tct cct gtg atc cat aac tca ctg ggt gcc gtg att 2439
Ile Met Pro Thr Ser Pro Val Ile His Asn Ser Leu Gly Ala Val Ile
490                 495                 500                 505

ggc att gca gtg ctg ggg acc ctt gtg gtg gcc ctg gta gca ctg ttt 2487
Gly Ile Ala Val Leu Gly Thr Leu Val Val Ala Leu Val Ala Leu Phe
                510                 515                 520

att ggc tac cga cac tgg caa aag ggc aag gaa cat gag cac ttg gca 2535
Ile Gly Tyr Arg His Trp Gln Lys Gly Lys Glu His Glu His Leu Ala
            525                 530                 535

gtg gct tac agc act ggg cga ctg gat ggc tcc gat tac gtc atg cca 2583
Val Ala Tyr Ser Thr Gly Arg Leu Asp Gly Ser Asp Tyr Val Met Pro
        540                 545                 550

gat gtc tct ccg agc tac agt cac tac tat tcc aac cct agc tac cac 2631
Asp Val Ser Pro Ser Tyr Ser His Tyr Tyr Ser Asn Pro Ser Tyr His
    555                 560                 565

aca ctg tct cag tgt tct cct aac cct cca ccc cct aac aag att cca 2679
Thr Leu Ser Gln Cys Ser Pro Asn Pro Pro Pro Pro Asn Lys Ile Pro
570                 575                 580                 585

ggc agt cag ctg ttt gtc agc tcc cag gca tct gag cgg cca aac aga 2727
Gly Ser Gln Leu Phe Val Ser Ser Gln Ala Ser Glu Arg Pro Asn Arg
                590                 595                 600

aac cat ggg cga gat aac cac gcc aca ctg ccc gct gac tgg aag cac 2775
Asn His Gly Arg Asp Asn His Ala Thr Leu Pro Ala Asp Trp Lys His
            605                 610                 615

cga cgg gag tcc cat gac aga gct ttc ctc agg cac cag cca cct gga 2823
Arg Arg Glu Ser His Asp Arg Ala Phe Leu Arg His Gln Pro Pro Gly
        620                 625                 630

ccg aag gta tagctgtagc tatggccaca ggaatggccc ggggccattc 2872
Pro Lys Val
    635



tgtcataaag gtcccatctc tgaagaagga ctaggggcaa gcgttatgtc cctgagcagt 2932

gagaacccct atgcgaccat ccgagacctg cccggcctgc ctggggaacc ccgagaaagc 2992

agctatgtgg agatgaaagg ccctccatca gtgtctcccc ccaggcagcc tcttcatctc 3052

cgggacaggc agcagcagca actgcagtct cagagagaca gcggcaccta tgagcagccc 3112

actcccttga gccgtaatga agagtctgtg ggctccatgc cccctcttcc tccgggcctg 3172

ccacccggcc actatgactc gcccaaaaac agccacatcc ctggacacta tgacttgcct 3232

ccagtacggc atcctccatc acctccatcc cggcgccagg accgctgagg agccagcatg 3292

gtatgggaga gtgcctgtga accctgccag gagcagggcc tggaccagca ggccatgaat 3352

agacatactt ggtgaagtga acggagactg aggatggctc tgcttccacc gagggagaca 3412

ctagttggca aagtgtctaa cctccctttt ccagcccatt gctcaagtcc cccaggctgt 3472

ggacatgagc tggtgggcag aatgttgttg ttgaagtctg attttagatt gattttttaa 3532

aaaaaaaaaa aaaaaaaaaa aaaaagggcg gccgc 3567



<210> 20 
<211> 636 
<212> PRT 
<213> Rattus sp.



<400> 20 

Met Gly Val Ile Cys Ser Leu Pro Cys Pro Glu Gly Phe His Gly Pro
1               5                   10                  15

Asn Cys Thr Gln Glu Cys Arg Cys His Asn Gly Gly Leu Cys Asp Arg
            20                  25                  30

Phe Thr Gly Gln Cys His Cys Ala Pro Gly Tyr Ile Gly Asp Arg Cys
        35                  40                  45

Arg Glu Glu Cys Pro Val Gly Arg Phe Gly Gln Asp Cys Ala Glu Thr
    50                  55                  60

Cys Asp Cys Ala Pro Gly Ala Arg Cys Phe Pro Ala Asn Gly Ala Cys
65                  70                  75                  80

Leu Cys Glu His Gly Phe Thr Gly Asp Arg Cys Thr Glu Arg Leu Cys
                85                  90                  95

Pro Asp Gly Arg Tyr Gly Leu Ser Cys Gln Asp Pro Cys Thr Cys Asp
            100                 105                 110

Pro Glu His Ser Leu Ser Cys His Pro Met His Gly Glu Cys Ser Cys
        115                 120                 125

Gln Pro Gly Trp Ala Gly Leu His Cys Asn Glu Ser Cys Pro Gln Asp
    130                 135                 140

Thr His Gly Ala Gly Cys Gln Glu His Cys Leu Cys Leu His Gly Gly
145                 150                 155                 160

Val Cys Leu Ala Asp Ser Gly Leu Cys Arg Cys Ala Pro Gly Tyr Thr
                165                 170                 175

Gly Pro His Cys Ala Asn Leu Cys Pro Pro Asn Thr Tyr Gly Ile Asn
            180                 185                 190

Cys Ser Ser His Cys Ser Cys Glu Asn Ala Ile Ala Cys Ser Pro Val
        195                 200                 205

Asp Gly Thr Cys Ile Cys Lys Glu Gly Trp Gln Arg Gly Asn Cys Ser
    210                 215                 220

Val Pro Cys Pro Pro Gly Thr Trp Gly Phe Ser Cys Asn Ala Ser Cys
225                 230                 235                 240

Gln Cys Ala His Glu Gly Val Cys Ser Pro Gln Thr Gly Ala Cys Thr
                245                 250                 255

Cys Thr Pro Gly Trp Arg Gly Val His Cys Gln Leu Pro Cys Pro Lys
            260                 265                 270

Gly Gln Phe Gly Glu Gly Cys Ala Ser Val Cys Asp Cys Asp His Ser
        275                 280                 285

Asp Gly Cys Asp Pro Val His Gly His Cys Arg Cys Gln Ala Gly Trp
    290                 295                 300

Met Gly Thr Arg Cys His Leu Pro Cys Pro Glu Gly Phe Trp Gly Ala
305                 310                 315                 320

Asn Cys Ser Asn Ala Cys Thr Cys Lys Asn Gly Gly Thr Cys Val Pro
                325                 330                 335

Glu Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys
            340                 345                 350

Gln Arg Pro Cys Pro Pro Gly Arg Tyr Gly Lys Arg Cys Val Pro Cys
        355                 360                 365

Lys Cys Asn Asn His Ser Ser Cys His Pro Ser Asp Gly Thr Cys Ser
    370                 375                 380

Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys Ser Glu Ser Cys Pro Pro
385                 390                 395                 400

Gly His Trp Gly Leu Lys Cys Ser Gln Pro Cys Gln Cys His His Gly
                405                 410                 415

Ala Thr Cys His Pro Gln Asp Gly Ser Cys Val Cys Ile Pro Gly Trp
            420                 425                 430

Thr Gly Pro Asn Cys Ser Glu Gly Cys Pro Ser Arg Met Phe Gly Val
        435                 440                 445

Asn Cys Ser Gln Leu Cys Gln Cys Asp Pro Gly Glu Met Cys His Pro
    450                 455                 460

Glu Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser Gly Ala His Cys
465                 470                 475                 480

Lys Val Gly Ser Gln Glu Ser Phe Thr Ile Met Pro Thr Ser Pro Val
                485                 490                 495

Ile His Asn Ser Leu Gly Ala Val Ile Gly Ile Ala Val Leu Gly Thr
            500                 505                 510

Leu Val Val Ala Leu Val Ala Leu Phe Ile Gly Tyr Arg His Trp Gln
        515                 520                 525

Lys Gly Lys Glu His Glu His Leu Ala Val Ala Tyr Ser Thr Gly Arg
    530                 535                 540

Leu Asp Gly Ser Asp Tyr Val Met Pro Asp Val Ser Pro Ser Tyr Ser
545                 550                 555                 560

His Tyr Tyr Ser Asn Pro Ser Tyr His Thr Leu Ser Gln Cys Ser Pro
                565                 570                 575

Asn Pro Pro Pro Pro Asn Lys Ile Pro Gly Ser Gln Leu Phe Val Ser
            580                 585                 590

Ser Gln Ala Ser Glu Arg Pro Asn Arg Asn His Gly Arg Asp Asn His
        595                 600                 605

Ala Thr Leu Pro Ala Asp Trp Lys His Arg Arg Glu Ser His Asp Arg
    610                 615                 620

Ala Phe Leu Arg His Gln Pro Pro Gly Pro Lys Val
625                 630                 635













<210> 21 
<211> 1908 
<212> DNA 
<213> Rattus sp.

<400> 21 

atgggtgtca tctgttccct gccatgccca gagggtttcc acggacccaa ctgtactcag 60 

gaatgtcgtt gccacaatgg tggcctttgt gacaggttta ctgggcagtg ccactgtgct 120

cctggctata tcggggatcg gtgccgtgaa gagtgccctg tgggccgctt cggtcaagac 180

tgtgctgaga cctgtgactg tgctcctggc gctcgttgct ttcctgccaa tggcgcgtgt 240

ctgtgcgaac atggcttcac aggcgaccgc tgcactgagc gactctgtcc agatggccgc 300 

tatggtctga gctgccaaga tccctgcacc tgcgacccag aacacagtct cagctgccac 360 

ccaatgcacg gcgagtgctc ctgccagcca ggttgggcgg gcctccactg caacgagagc 420 

tgccctcagg acacgcacgg agccggttgc caggagcact gcctctgtct gcacggcggt 480

gtttgcctcg ccgacagcgg cctctgccgg tgtgcacctg gctacacggg acctcactgc 540 

gctaatcttt gtccacctaa cacttatggg atcaactgtt cctcccactg ctcctgtgaa 600 

aatgccattg cctgctctcc tgtcgacggc acgtgcatct gcaaggaagg ttggcagcgt 660 

ggtaactgct ctgtgccctg tccccctggc acctggggct tcagttgcaa tgccagttgc 720

cagtgtgccc acgagggagt ctgcagcccc caaactggag cctgtacttg cacccctggg 780

tggcgtgggg ttcactgcca acttccgtgc ccgaagggac agtttggtga aggttgtgcc 840

agtgtctgtg actgtgacca ctccgatggc tgtgaccctg ttcatggaca ctgccgatgt 900 

caggctggct ggatgggcac acgttgccac ctgccttgcc cagagggctt ttggggagcc 960 

aactgcagca atgcctgtac ctgcaagaat ggtggcactt gtgtacctga gaacggcaac 1020 

tgtgtgtgcg caccagggtt cagaggcccc tcctgccaga ggccctgccc gcctggtcgc 1080 

tatggcaaac gctgtgtgcc ctgcaagtgc aacaaccatt cttcctgcca cccgtcggat 1140 

gggacctgct cctgcctggc aggctggaca ggccctgact gctctgaatc atgtccccca 1200 

ggccactggg gactcaaatg ctcccaaccc tgccagtgtc atcatggtgc cacctgccac 1260 

ccccaggatg ggagctgtgt ctgcatccca ggctggactg gacccaactg ctcggaaggc 1320 

tgcccatcaa gaatgtttgg tgtcaactgc tcccagctat gtcagtgtga tcctggagag 1380 

atgtgccacc cagagactgg ggcttgcgtc tgtcccccag gacacagtgg tgcgcactgc 1440 

aaagtgggca gccaggagtc cttcaccata atgcccacct ctcctgtgat ccataactca 1500 

ctgggtgccg tgattggcat tgcagtgctg gggacccttg tggtggccct ggtagcactg 1560 

tttattggct accgacactg gcaaaagggc aaggaacatg agcacttggc agtggcttac 1620

agcactgggc gactggatgg ctccgattac gtcatgccag atgtctctcc gagctacagt 1680 

cactactatt ccaaccctag ctaccacaca ctgtctcagt gttctcctaa ccctccaccc 1740 

cctaacaaga ttccaggcag tcagctgttt gtcagctccc aggcatctga gcggccaaac 1800 

agaaaccatg ggcgagataa ccacgccaca ctgcccgctg actggaagca ccgacgggag 1860 

tcccatgaca gagctttcct caggcaccag ccacctggac cgaaggta 1908 

<210> 22 






<211> 1497 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (217)...(684)
<400> 22 

gtcgacccac gcgtccggct cccagcccac ccccaaacag acacagcgta gcccgggcca 60 
gctcttaagg agttcaggag tgagaagagg ccctcagaga tctgacagcc taggagtgcg 120
tggacaccac ctcagcccac tgagcaggag tcacagcacg aagaccaagc gcaaagcgac 180
ccctgccctc catcctgact gctcctccta agagag atg gca ccg gcc aga gca 234
Met Ala Pro Ala Arg Ala
1 5

gga ttc tgc ccc ctt ctg ctg ctt ctg ctg ctg ggg ctg tgg gtg gca 282
Gly Phe Cys Pro Leu Leu Leu Leu Leu Leu Leu Gly Leu Trp Val Ala
10 15 20

gag atc cca gtc agt gcc aag ccc aag ggc atg acc tca tca cag tgg 330
Glu Ile Pro Val Ser Ala Lys Pro Lys Gly Met Thr Ser Ser Gln Trp
25 30 35

ttt aaa att cag cac atg cag ccc agc cct caa gca tgc aac tca gcc 378
Phe Lys Ile Gln His Met Gln Pro Ser Pro Gln Ala Cys Asn Ser Ala
40 45 50

atg aaa aac att aac aag cac aca aaa cgg tgc aaa gac ctc aac acc 426
Met Lys Asn Ile Asn Lys His Thr Lys Arg Cys Lys Asp Leu Asn Thr
55 60 65 70

ttc ctg cac gag cct ttc tcc agt gtg gcc gcc acc tgc cag acc ccc 474
Phe Leu His Glu Pro Phe Ser Ser Val Ala Ala Thr Cys Gln Thr Pro
75 80 85

aaa ata gcc tgc aag aat ggc gat aaa aac tgc cac cag agc cac ggg 522
Lys Ile Ala Cys Lys Asn Gly Asp Lys Asn Cys His Gln Ser His Gly
90 95 100

ccc gtg tcc ctg acc atg tgt aag ctc acc tca ggg aag tat ccg aac 570
Pro Val Ser Leu Thr Met Cys Lys Leu Thr Ser Gly Lys Tyr Pro Asn
105 110 115

tgc agg tac aaa gag aag cga cag aac aag tct tac gta gtg gcc tgt 618
Cys Arg Tyr Lys Glu Lys Arg Gln Asn Lys Ser Tyr Val Val Ala Cys
120 125 130

aag cct ccc cag aaa aag gac tct cag caa ttc cac ctg gtt cct gta 666
Lys Pro Pro Gln Lys Lys Asp Ser Gln Gln Phe His Leu Val Pro Val
135 140 145 150

cac ttg gac aga gtc ctt taggtttcca gactggcttg ctctttggct 714
His Leu Asp Arg Val Leu
155

gaccttcaat tccctctcca ggactccgca ccactcccct acacccagag cattctcttc 774
ccctcatctc ttggggctgt tcctggttca gcctctgctg ggaggctgaa gctgacactc 834
tggtgagctg agctctagag ggatggcttt tcatcttttt gttgctgttt tcccagatgc 894
ttatccccaa gaaacagcaa gctcaggtct ^tgggttccc tggtctatgc cattgcacat 954
gtgtcccctg ccccctggca ttagggcagc atgacaagga gaggaaataa atggaaaggg 1014
ggcatatggg atttgtggac acagctgttt ctgttcctga actagaagtc ttccccagct 1074
ctgacgtggc agtgaggtga cctgaaggaa agaaaaatac aaataaatac cacttcatat 1134
ttgtatagaa tcctctaatc ccttgtgaca tagacttgac agggattgta tgccttcttt 1194
atggatgagg aaattaaggt tttagaaagc ttaatgaatt aaagagcttg tctaattagt 1254
tagtagcaga acctggactt gaacctaggt ctccttgctc caaatacagt gtaccttcta 1314
ctctaccagt tgcgcaagaa agaagtcact gttacagagg caagcggtga actaggtaag 1374
agttcactca tgaagaaacg agtgctctga agagccagtt accctgtgtt ggctgcaata 1434
aaggtcatta cctctctagc caaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1494
aaa 1497



<210> 23
<211> 156
<212> PRT
<213> Homo sapiens

<400> 23
Met Ala Pro Ala Arg Ala Gly Phe Cys Pro Leu Leu Leu Leu Leu Leu
1 5 10 15

Leu Gly Leu Trp Val Ala Glu Ile Pro Val Ser Ala Lys Pro Lys Gly
20 25 30

Met Thr Ser Ser Gln Trp Phe Lys Ile Gln His Met Gln Pro Ser Pro
35 40 45

Gln Ala Cys Asn Ser Ala Met Lys Asn Ile Asn Lys His Thr Lys Arg
50 55 60

Cys Lys Asp Leu Asn Thr Phe Leu His Glu Pro Phe Ser Ser Val Ala
65 70 75 80

Ala Thr Cys Gln Thr Pro Lys Ile Ala Cys Lys Asn Gly Asp Lys Asn
85 90 95

Cys His Gln Ser His Gly Pro Val Ser Leu Thr Met Cys Lys Leu Thr
100 105 110

Ser Gly Lys Tyr Pro Asn Cys Arg Tyr Lys Glu Lys Arg Gln Asn Lys
115 120 125

Ser Tyr Val Val Ala Cys Lys Pro Pro Gln Lys Lys Asp Ser Gln Gln
130 135 140

Phe His Leu Val Pro Val His Leu Asp Arg Val Leu
145 150 155

<210> 24 

<211> 468 

<212> DNA 

<213> Homo sapiens 



<400> 24 

atggcaccgg ccagagcagg attctgcccc cttctgctgc ttctgctgct ggggctgtgg 60

gtggcagaga tcccagtcag tgccaagccc aagggcatga cctcatcaca gtggtttaaa 120 

attcagcaca tgcagcccag ccctcaagca tgcaactcag ccatgaaaaa cattaacaag 180 

cacacaaaac ggtgcaaaga cctcaacacc ttcctgcacg agcctttctc cagtgtggcc 240 

gccacctgcc agacccccaa aatagcctgc aagaatggcg ataaaaactg ccaccagagc 300

cacgggcccg tgtccctgac catgtgtaag ctcacctcag ggaagtatcc gaactgcagg 360

tacaaagaga agcgacagaa caagtcttac gtagtggcct gtaagcctcc ccagaaaaag 420 

gactctcagc aattccacct ggttcctgta cacttggaca gagtcctt 468 

<210> 25 
<211> 1788 
<212> DNA 
<213> Homo sapiens



<220> 

<221> CDS 

<222> (62)...(976)

<400> 25 

gtcgacccac ggcgtccggc caggctccac tgaggggaac ggggacctgt ctgaagagaa 60
g atg ccc ctg ctg aca ctc tac ctg ctc ctc ttc tgg ctc tca ggc tac 109
Met Pro Leu Leu Thr Leu Tyr Leu Leu Leu Phe Trp Leu Ser Gly Tyr
1 5 10 15

tcc att gcc act caa atc acc ggt cca aca aca gtg aat ggc ttg gag 157
Ser Ile Ala Thr Gln Ile Thr Gly Pro Thr Thr Val Asn Gly Leu Glu
20 25 30

cgg ggc tcc ttg acc gtg cag tgt gtt tac aga tca ggc tgg gag acc 205
Arg Gly Ser Leu Thr Val Gln Cys Val Tyr Arg Ser Gly Trp Glu Thr
35 40 45

tac ttg aag tgg tgg tgt cga gga gct att tgg cgt gac tgc aag atc 253
Tyr Leu Lys Trp Trp Cys Arg Gly Ala Ile Trp Arg Asp Cys Lys Ile
50 55 60

ctt gtt aaa acc agt ggg tca gag cag gag gtg aag agg gac cgg gtg 301
Leu Val Lys Thr Ser Gly Ser Glu Gln Glu Val Lys Arg Asp Arg Val
65 70 75 80

tcc atc aag gac aat cag aaa aac cgc acg ttc act gtg acc atg gag 349
Ser Ile Lys Asp Asn Gln Lys Asn Arg Thr Phe Thr Val Thr Met Glu
85 90 95

gat ctc atg aaa act gat gct gac act tac tgg tgt gga att gag aaa 397
Asp Leu Met Lys Thr Asp Ala Asp Thr Tyr Trp Cys Gly Ile Glu Lys
100 105 110

act gga aat gac ctt ggg gtc aca gtt caa gtg acc att gac cca gcg 445
Thr Gly Asn Asp Leu Gly Val Thr Val Gln Val Thr Ile Asp Pro Ala
115 120 125

tcg act cct gcc ccc acc acg cct act tcc act acg ttt aca gca cca 493
Ser Thr Pro Ala Pro Thr Thr Pro Thr Ser Thr Thr Phe Thr Ala Pro
130 135 140

gtc acc caa gaa gaa act agc agc tcc cca act ctg acc ggc cac cac 541
Val Thr Gln Glu Glu Thr Ser Ser Ser Pro Thr Leu Thr Gly His His
145 150 155 160

ttg gac aac agg cac aag ctc ctg aag ctc agt gtc ctc ctg ccc ctc 589
Leu Asp Asn Arg His Lys Leu Leu Lys Leu Ser Val Leu Leu Pro Leu
165 170 175

atc ttc acc ata ttg ctg ctg ctt ttg gtg gcc gcc tca ctc ttg gct 637
Ile Phe Thr Ile Leu Leu Leu Leu Leu Val Ala Ala Ser Leu Leu Ala
180 185 190

tgg agg atg atg aag tac cag cag aaa gca gcc ggg atg tcc cca gag 685
Trp Arg Met Met Lys Tyr Gln Gln Lys Ala Ala Gly Met Ser Pro Glu
195 200 205

cag gta ctg cag ccc ctg gag ggc gac ctc tgc tat gca gac ctg acc 733
Gln Val Leu Gln Pro Leu Glu Gly Asp Leu Cys Tyr Ala Asp Leu Thr
210 215 220

ctg cag ctg gcc gga acc tcc ccg cga aag gct acc acg aag ctt tcc 781
Leu Gln Leu Ala Gly Thr Ser Pro Arg Lys Ala Thr Thr Lys Leu Ser
225 230 235 240

tct gcc cag gtt gac cag gtg gaa gtg gaa tat gtc acc atg gct tcc 829
Ser Ala Gln Val Asp Gln Val Glu Val Glu Tyr Val Thr Met Ala Ser
245 250 255

ttg ccg aag gag gac att tcc tat gca tct ctg acc ttg ggt gct gag 877
Leu Pro Lys Glu Asp Ile Ser Tyr Ala Ser Leu Thr Leu Gly Ala Glu
260 265 270

gat cag gaa ccg acc tac tgc aac atg ggc cac ctc agt agc cac ctc 925
Asp Gln Glu Pro Thr Tyr Cys Asn Met Gly His Leu Ser Ser His Leu
275 280 285

ccc ggc agg ggc cct gag gag ccc acg gaa tac agc acc atc agc agg 973
Pro Gly Arg Gly Pro Glu Glu Pro Thr Glu Tyr Ser Thr Ile Ser Arg
290 295 300

cct tagcctgcac tccaggctcc ttcttggacc ccaggctgtg agcacactcc 1026
Pro
305

tgcctcatcg accgtctgcc ccctgctccc ctcatcagga ccaacccggg gactggtgcc 1086
tctgcctgat cagccagcat tgcccctagc tctgggttgg gcttggggcc aagtctcagg 1146
gggcttctag gagttggggt tttctaaacg tcccctcctc tcctacatag ttgaggaggg 1206
ggctagggat atgctctggg gctttcatgg gaatgatgaa gatgataatg agaaaaatgt 1266
tatcattatt atcatgaagt accattatca taatacaatg aacctttatt tattgcctac 1326
cacatgttat gggctgaata atggccccca aagatatctg tgtcctaatc ctcagaactt 1386
gtgactgtta ccttctgtgg cagaaaggga cagtgcagat gtatgtaagt taaggacttt 1446
gagatagaga ggttattctt gctgattcag gtgggcccaa aatatcacca caagggtcct 1506
cataagaaag aggccagaag gtcaaagagg tagagacaaa gtgatgatgg aagtggacgt 1566
gggtgtgacg tgagcagggg ccatgaatgc cgcagccttc agatgccaga aagggaaagg 1626
aatggattcc cctgcctgga gcctccaaaa gaaaccagcc ctgcccacgc cttgacttga 1686
gcccattgaa actgatcttg agctcctggc ctccagaatt gcaggagaat aaatttgtgt 1746
tgtttttaaa aaaaaaaaaa aaaaaaaagg gcggccgcta ga 1788

<210> 26
<211> 305
<212> PRT
<213> Homo sapiens

<400> 26
Met Pro Leu Leu Thr Leu Tyr Leu Leu Leu Phe Trp Leu Ser Gly Tyr
1 5 10 15

Ser Ile Ala Thr Gln Ile Thr Gly Pro Thr Thr Val Asn Gly Leu Glu
20 25 30

Arg Gly Ser Leu Thr Val Gln Cys Val Tyr Arg Ser Gly Trp Glu Thr
35 40 45

Tyr Leu Lys Trp Trp Cys Arg Gly Ala Ile Trp Arg Asp Cys Lys Ile
50 55 60

Leu Val Lys Thr Ser Gly Ser Glu Gln Glu Val Lys Arg Asp Arg Val
65 70 75 80

Ser Ile Lys Asp Asn Gln Lys Asn Arg Thr Phe Thr Val Thr Met Glu
85 90 95

Asp Leu Met Lys Thr Asp Ala Asp Thr Tyr Trp Cys Gly Ile Glu Lys
100 105 110

Thr Gly Asn Asp Leu Gly Val Thr Val Gln Val Thr Ile Asp Pro Ala
115 120 125

Ser Thr Pro Ala Pro Thr Thr Pro Thr Ser Thr Thr Phe Thr Ala Pro
130 135 140

Val Thr Gln Glu Glu Thr Ser Ser Ser Pro Thr Leu Thr Gly His His
145 150 155 160

Leu Asp Asn Arg His Lys Leu Leu Lys Leu Ser Val Leu Leu Pro Leu
165 170 175

Ile Phe Thr Ile Leu Leu Leu Leu Leu Val Ala Ala Ser Leu Leu Ala
180 185 190

Trp Arg Met Met Lys Tyr Gln Gln Lys Ala Ala Gly Met Ser Pro Glu
195 200 205

Gln Val Leu Gln Pro Leu Glu Gly Asp Leu Cys Tyr Ala Asp Leu Thr
210 215 220

Leu Gln Leu Ala Gly Thr Ser Pro Arg Lys Ala Thr Thr Lys Leu Ser
225 230 235 240

Ser Ala Gln Val Asp Gln Val Glu Val Glu Tyr Val Thr Met Ala Ser
245 250 255

Leu Pro Lys Glu Asp Ile Ser Tyr Ala Ser Leu Thr Leu Gly Ala Glu
260 265 270

Asp Gln Glu Pro Thr Tyr Cys Asn Met Gly His Leu Ser Ser His Leu
275 280 285

Pro Gly Arg Gly Pro Glu Glu Pro Thr Glu Tyr Ser Thr Ile Ser Arg
290 295 300

Pro
305



<210> 27
<211> 915
<212> DNA
<213> Homo sapiens

<400> 27
atgcccctgc tgacactcta cctgctcctc ttctggctct caggctactc cattgccact 60
caaatcaccg gtccaacaac agtgaatggc ttggagcggg gctccttgac cgtgcagtgt 120
gtttacagat caggctggga gacctacttg aagtggtggt gtcgaggagc tatttggcgt 180
gactgcaaga tccttgttaa aaccagtggg tcagagcagg aggtgaagag ggaccgggtg 240
tccatcaagg acaatcagaa aaaccgcacg ttcactgtga ccatggagga tctcatgaaa 300
actgatgctg acacttactg gtgtggaatt gagaaaactg gaaatgacct tggggtcaca 360
gttcaagtga ccattgaccc agcgtcgact cctgccccca ccacgcctac ttccactacg 420
tttacagcac cagtcaccca agaagaaact agcagctccc caactctgac cggccaccac 480
ttggacaaca ggcacaagct cctgaagctc agtgtcctcc tgcccctcat cttcaccata 540
ttgctgctgc ttttggtggc cgcctcactc ttggcttgga ggatgatgaa gtaccagcag 600
aaagcagccg ggatgtcccc agagcaggta ctgcagcccc tggagggcga cctctgctat 660
gcagacctga ccctgcagct ggccggaacc tccccgcgaa aggctaccac gaagctttcc 720
tctgcccagg ttgaccaggt ggaagtggaa tatgtcacca tggcttcctt gccgaaggag 780
gacatttcct atgcatctct gaccttgggt gctgaggatc aggaaccgac ctactgcaac 840
atgggccacc tcagtagcca cctccccggc aggggccctg aggagcccac ggaatacagc 900
accatcagca ggcct 915

<210> 28
<211> 3258
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (42)...(1625)
<400> 28

cacgcgtccg gccagttctt ggaggagact ctgcacaggg c atg gat cac tgt ggt 56 

Met Asp His Cys Gly 
1 5 

gcc ctt ttc ctg tgc ctg tgc ctt ctg act ttg cag aat gca aca aca 104 
Ala Leu Phe Leu Cys Leu Cys Leu Leu Thr Leu Gln Asn Ala Thr Thr
10 15 20

gag aca tgg gaa gaa ctc ctg agc tac atg gag aat atg cag gtg tcc 152
Glu Thr Trp Glu Glu Leu Leu Ser Tyr Met Glu Asn Met Gln Val Ser
25 30 35

agg ggc cgg agc tca gtt ttt tcc tct cgt caa ctc cac cag ctg gag 200
Arg Gly Arg Ser Ser Val Phe Ser Ser Arg Gln Leu His Gln Leu Glu
40 45 50

cag atg cta ctg aac acc agc ttc cca ggc tac aac ctg acc ttg cag 248
Gln Met Leu Leu Asn Thr Ser Phe Pro Gly Tyr Asn Leu Thr Leu Gln
55 60 65

aca ccc acc atc cag tct ctg gcc ttc aag ctg agc tgt gac ttc tct 296
Thr Pro Thr Ile Gln Ser Leu Ala Phe Lys Leu Ser Cys Asp Phe Ser
70 75 80 85

ggc ctc tcg ctg acc agt gcc act ctg aag cgg gtg ccc cag gca gga 344
Gly Leu Ser Leu Thr Ser Ala Thr Leu Lys Arg Val Pro Gln Ala Gly
90 95 100

ggt cag cat gcc cgg ggt cag cac gcc atg cag ttc ccc gcc gag ctg 392
Gly Gln His Ala Arg Gly Gln His Ala Met Gln Phe Pro Ala Glu Leu
105 110 115

acc cgg gac gcc tgc aag acc cgc ccc agg gag ctg cgg ctc atc tgt 440
Thr Arg Asp Ala Cys Lys Thr Arg Pro Arg Glu Leu Arg Leu Ile Cys
120 125 130

atc tac ttc tcc aac acc cac ttt ttc aag gat gaa aac aac tca tct 488
Ile Tyr Phe Ser Asn Thr His Phe Phe Lys Asp Glu Asn Asn Ser Ser
135 140 145

ctg ctg aat aac tac gtc ctg ggg gcc cag ctg agt cat ggg cac gtg 536
Leu Leu Asn Asn Tyr Val Leu Gly Ala Gln Leu Ser His Gly His Val
150 155 160 165

aac aac ctc agg gat cct gtg aac atc agc ttc tgg cac aac caa agc 584
Asn Asn Leu Arg Asp Pro Val Asn Ile Ser Phe Trp His Asn Gln Ser
170 175 180

ctg gaa ggc tac acc ctg acc tgt gtc ttc tgg aag gag gga gcc agg 632
Leu Glu Gly Tyr Thr Leu Thr Cys Val Phe Trp Lys Glu Gly Ala Arg
185 190 195

aaa cag ccc tgg ggg ggc tgg agc cct gag ggc tgt cgt aca gag cag 680
Lys Gln Pro Trp Gly Gly Trp Ser Pro Glu Gly Cys Arg Thr Glu Gln
200 205 210

ccc tcc cac tct cag gtg ctc tgc cgc tgc aac cac ctc acc tac ttt 728
Pro Ser His Ser Gln Val Leu Cys Arg Cys Asn His Leu Thr Tyr Phe
215 220 225

gct gtt ctc atg caa ctc tcc cca gcc ctg gtc cct gca gag ttg ctg 776
Ala Val Leu Met Gln Leu Ser Pro Ala Leu Val Pro Ala Glu Leu Leu
230 235 240 245

gca cct ctt acg tac atc tcc ctc gtg ggc tgc agc atc tcc atc gtg 824
Ala Pro Leu Thr Tyr Ile Ser Leu Val Gly Cys Ser Ile Ser Ile Val
250 255 260

gcc tcg ctg atc aca gtc ctg ctg cac ttc cat ttc agg aag cag agt 872
Ala Ser Leu Ile Thr Val Leu Leu His Phe His Phe Arg Lys Gln Ser
265 270 275

gac tcc tta aca cgc atc cac atg aac ctg cat gcc tcc gtg ctg ctc 920
Asp Ser Leu Thr Arg Ile His Met Asn Leu His Ala Ser Val Leu Leu
280 285 290

ctg aac atc gcc ttc ctg ctg agc ccc gca ttc gca atg tct cct gtg 968
Leu Asn Ile Ala Phe Leu Leu Ser Pro Ala Phe Ala Met Ser Pro Val
295 300 305

ccc ggg tca gca tgc acg gct ctg gcc gct gcc ctg cac tac gcg ctg 1016
Pro Gly Ser Ala Cys Thr Ala Leu Ala Ala Ala Leu His Tyr Ala Leu
310 315 320 325

ctc agc tgc ctc acc tgg atg gcc atc gag ggc ttc aac ctc tac ctc 1064
Leu Ser Cys Leu Thr Trp Met Ala Ile Glu Gly Phe Asn Leu Tyr Leu
330 335 340

ctc ctc ggg cgt gtc tac aac atc tac atc cgc aga tat gtg ttc aag 1112
Leu Leu Gly Arg Val Tyr Asn Ile Tyr Ile Arg Arg Tyr Val Phe Lys
345 350 355

ctt ggt gtg cta ggc tgg ggg gcc cca gcc ctc ctg gtg ctg ctt tcc 1160
Leu Gly Val Leu Gly Trp Gly Ala Pro Ala Leu Leu Val Leu Leu Ser
360 365 370

ctc tct gtc aag agc tcg gta tac gga ccc tgc aca atc ccc gtc ttc 1208
Leu Ser Val Lys Ser Ser Val Tyr Gly Pro Cys Thr Ile Pro Val Phe
375 380 385

gac agc tgg gag aat ggc aca ggc ttc cag aac atg tcc ata tgc tgg 1256
Asp Ser Trp Glu Asn Gly Thr Gly Phe Gln Asn Met Ser Ile Cys Trp
390 395 400 405

gtg cgg agc ccc gtg gtg cac agt gtc ctg gtc atg ggc tac ggc ggc 1304
Val Arg Ser Pro Val Val His Ser Val Leu Val Met Gly Tyr Gly Gly
410 415 420

ctc acg tcc ctc ttc aac ctg gtg gtg ctg gcc tgg gcg ctg tgg acc 1352
Leu Thr Ser Leu Phe Asn Leu Val Val Leu Ala Trp Ala Leu Trp Thr
425 430 435

ctg cgc agg ctg cgg gag cgg gcg gat gca cca agt gtc agg gcc tgc 1400
Leu Arg Arg Leu Arg Glu Arg Ala Asp Ala Pro Ser Val Arg Ala Cys
440 445 450

cat gac act gtc act gtg ctg ggc ctc acc gtg ctg ctg gga acc acc 1448
His Asp Thr Val Thr Val Leu Gly Leu Thr Val Leu Leu Gly Thr Thr
455 460 465

tgg gcc ttg gcc ttc ttt tct ttt ggc gtc ttc ctg ctg ccc cag ctg 1496
Trp Ala Leu Ala Phe Phe Ser Phe Gly Val Phe Leu Leu Pro Gln Leu
470 475 480 485

ttc ctc ttc acc atc tta aac tcg ctc tac ggt ttc ttc ctt ttc ctg 1544
Phe Leu Phe Thr Ile Leu Asn Ser Leu Tyr Gly Phe Phe Leu Phe Leu
490 495 500

tgg ttc tgc tcc cag cgg tgc cgc tca gaa gca gag gcc aag gca cag 1592
Trp Phe Cys Ser Gln Arg Cys Arg Ser Glu Ala Glu Ala Lys Ala Gln
505 510 515

ata gag gcc ttc agc tcc tcc caa aca aca cag tagtccgggc ctcctggcct 1645
Ile Glu Ala Phe Ser Ser Ser Gln Thr Thr Gln
520 525

ggaatcctca gcctctctgg ccgccagtag cctgaggcta cggctcctgc tagagagggt 1705

ggcaggcctg ctgctggacc ccagaggcca ctgtgaccgc caaggggcct tttccacttc 1765

cacggcctct ccaggcactg aggggaaggc attgctctac ctctccctga cattttgctc 1825

cggggcagat ccaaccttac ctggggcagc aaactttgtc ctggtacctg ggcccagctc 1885

gccagggatg tgggcagagc accagcctgg gcatcaggaa gccaagtttc aaggactgtc 1945

tttgagtctg tctgtatgac cttgggcctg ccacttctca cagaccctag gtatccacag 2005

ctgtgacatg ggggcaagcg gctttgtttc agcctaaccc aggagcttag taaaaattgc 2065

ataagaccag ggggaagagt gtcagcgtgg ggtgggaatt cccgcggcct ccacctgctt 2125

gctaggggca ggatctcatt caggctgccc tggaagcacc tgcttggccc tgccaccttc 2185

ctccagggga gggccagatg gcatcctggc ttggggcggg tgggacctac ccaggctctg 2245

agactttact ggcctatgcc tgaggcctct tttcctttaa ctccctaaat tatgatgact 2305

ccaagtccaa gcccaccctt cccaaagatt gggaggttcc gccgttccca gaggctcctc 2365

ctgcggtgct cccaagactt ccatagacca tctggaccag tagcccatcc cgcagttttc 2425

ttgggggcag aggaaaacgc ttctttctcc tccagctgaa tcagctggat cccagtgtcc 2485

tggctgtttg gtgattgggc aagattgaat ttgcccaggt aggcgtgaga gtgtgggttt 2545

taaattcgaa gctcaggcca tagtttcaga gaatcaccct taccccagac cttcatgaga 2605

cagtgctcat gaagccagtg cgtttcccag aacgaacact aggcggcacc gttggtccac 2665

actcagaggc ccttggcgcc aagactgcat ctagaatcgc tcaaacacct gtttgcagac 2725

cccatgcacc agctggaggg gccgtaactg caggactgcg cctactgagt gacccatttc 2785

ctccaggagg aaaggcaaga cacgcttaca cggccatttg tctcttttcc caatgcggcg 2845

gtgcactttc gctcttgggg gctgcacccc agacatagct ggcaccagag cagggtgctc 2905

aggtggtggg tgctcagggc cctgccccag gccactgggc cgttttgatg acctcgaagg 2965

tcacaggcag aaaataggag caggatttcc cctggggaaa agttctcctg ggacatcttc 3025

tgctcttctg tacatttcta gatgcaaata actccttcac caggcagtga gtggcgtagg 3085

ctctggagcc aggctgcctg ggctccaatg ccagctctgc cacttgctag ctgtgagact 3145

gtggacaaac cactcagcct ctgtgtgcct cagttttcct atttgtaaaa tagaggccat 3205

agtggtacct attttgaaga ctaagtaaaa gaattcaaat aaagagactt ggc 3258

<210> 29
<211> 528
<212> PRT
<213> Homo sapiens

<400> 29
Met Asp His Cys Gly Ala Leu Phe Leu Cys Leu Cys Leu Leu Thr Leu
1 5 10 15

Gln Asn Ala Thr Thr Glu Thr Trp Glu Glu Leu Leu Ser Tyr Met Glu
20 25 30

Asn Met Gln Val Ser Arg Gly Arg Ser Ser Val Phe Ser Ser Arg Gln
35 40 45

Leu His Gln Leu Glu Gln Met Leu Leu Asn Thr Ser Phe Pro Gly Tyr
50 55 60

Asn Leu Thr Leu Gln Thr Pro Thr Ile Gln Ser Leu Ala Phe Lys Leu
65 70 75 80

Ser Cys Asp Phe Ser Gly Leu Ser Leu Thr Ser Ala Thr Leu Lys Arg
85 90 95

Val Pro Gln Ala Gly Gly Gln His Ala Arg Gly Gln His Ala Met Gln
100 105 110

Phe Pro Ala Glu Leu Thr Arg Asp Ala Cys Lys Thr Arg Pro Arg Glu
115 120 125

Leu Arg Leu Ile Cys Ile Tyr Phe Ser Asn Thr His Phe Phe Lys Asp
130 135 140

Glu Asn Asn Ser Ser Leu Leu Asn Asn Tyr Val Leu Gly Ala Gln Leu
145 150 155 160

Ser His Gly His Val Asn Asn Leu Arg Asp Pro Val Asn Ile Ser Phe
165 170 175

Trp His Asn Gln Ser Leu Glu Gly Tyr Thr Leu Thr Cys Val Phe Trp
180 185 190

Lys Glu Gly Ala Arg Lys Gln Pro Trp Gly Gly Trp Ser Pro Glu Gly
195 200 205

Cys Arg Thr Glu Gln Pro Ser His Ser Gln Val Leu Cys Arg Cys Asn
210 215 220

His Leu Thr Tyr Phe Ala Val Leu Met Gln Leu Ser Pro Ala Leu Val
225 230 235 240

Pro Ala Glu Leu Leu Ala Pro Leu Thr Tyr Ile Ser Leu Val Gly Cys
245 250 255

Ser Ile Ser Ile Val Ala Ser Leu Ile Thr Val Leu Leu His Phe His
260 265 270

Phe Arg Lys Gln Ser Asp Ser Leu Thr Arg Ile His Met Asn Leu His
275 280 285

Ala Ser Val Leu Leu Leu Asn Ile Ala Phe Leu Leu Ser Pro Ala Phe
290 295 300

Ala Met Ser Pro Val Pro Gly Ser Ala Cys Thr Ala Leu Ala Ala Ala
305 310 315 320

Leu His Tyr Ala Leu Leu Ser Cys Leu Thr Trp Met Ala Ile Glu Gly
325 330 335

Phe Asn Leu Tyr Leu Leu Leu Gly Arg Val Tyr Asn Ile Tyr Ile Arg
340 345 350

Arg Tyr Val Phe Lys Leu Gly Val Leu Gly Trp Gly Ala Pro Ala Leu
355 360 365

Leu Val Leu Leu Ser Leu Ser Val Lys Ser Ser Val Tyr Gly Pro Cys
370 375 380

Thr Ile Pro Val Phe Asp Ser Trp Glu Asn Gly Thr Gly Phe Gln Asn
385 390 395 400

Met Ser Ile Cys Trp Val Arg Ser Pro Val Val His Ser Val Leu Val
405 410 415

Met Gly Tyr Gly Gly Leu Thr Ser Leu Phe Asn Leu Val Val Leu Ala
420 425 430

Trp Ala Leu Trp Thr Leu Arg Arg Leu Arg Glu Arg Ala Asp Ala Pro
435 440 445

Ser Val Arg Ala Cys His Asp Thr Val Thr Val Leu Gly Leu Thr Val
450 455 460

Leu Leu Gly Thr Thr Trp Ala Leu Ala Phe Phe Ser Phe Gly Val Phe
465 470 475 480

Leu Leu Pro Gln Leu Phe Leu Phe Thr Ile Leu Asn Ser Leu Tyr Gly
485 490 495

Phe Phe Leu Phe Leu Trp Phe Cys Ser Gln Arg Cys Arg Ser Glu Ala
500 505 510

Glu Ala Lys Ala Gln Ile Glu Ala Phe Ser Ser Ser Gln Thr Thr Gln
515 520 525

<210> 30 

<211> 1584 

<212> DNA 

<213> Homo sapiens 

<400> 30

atggatcact gtggtgccct tttcctgtgc ctgtgccttc tgactttgca gaatgcaaca 60 

acagagacat gggaagaact cctgagctac atggagaata tgcaggtgtc caggggccgg 120 

agctcagttt tttcctctcg tcaactccac cagctggagc agatgctact gaacaccagc 180 

ttcccaggct acaacctgac cttgcagaca cccaccatcc agtctctggc cttcaagctg 240 

agctgtgact tctctggcct ctcgctgacc agtgccactc tgaagcgggt gccccaggca 300 

ggaggtcagc atgcccgggg tcagcacgcc atgcagttcc ccgccgagct gacccgggac 360 

gcctgcaaga cccgccccag ggagctgcgg ctcatctgta tctacttctc caacacccac 420 

tttttcaagg atgaaaacaa ctcatctctg ctgaataacc acgtcctggg ggcccagctg 480 

agtcatgggc acgtgaacaa cctcagggat cctgtgaaca tcagcttctg gcacaaccaa 540

agcctggaag gctacaccct gacctgtgtc ttctggaagg agggagccag gaaacagccc 600 

tgggggggct ggagccctga gggctgtcgt acagagcagc cctcccactc tcaggtgctc 660 

tgccgctgca accacctcac ctactttgct gttctcatgc aactctcccc agccctggtc 720 

cctgcagagt tgctggcacc tcttacgtac atctccctcg tgggctgcag catctccatc 780 

gtggcctcgc tgatcacagt cctgctgcac ttccatttca ggaagcagag tgactcctta 840 

acacgcatcc acatgaacct gcatgcctcc gtgctgctcc tgaacatcgc cttcctgctg 900 

agccccgcat tcgcaatgtc tcctgtgccc gggtcagcat gcacggctct ggccgctgcc 960 

ctgcactacg cgctgctcag ctgcctcacc tggatggcca tcgagggctt caacctctac 1020 

ctcctcctcg ggcgtgtcta caacatctac atccgcagat atgtgttcaa gcttggtgtg 1080 

ctaggctggg gggccccagc cctcctggtg ctgctttccc tctctgtcaa gagctcggta 1140

tacggaccct gcacaatccc cgtcttcgac agctgggaga atggcacagg cttccagaac 1200 

atgtccatat gctgggtgcg gagccccgtg gtgcacagtg tcctggtcat gggctacggc 1260 

ggcctcacgt ccctcttcaa cctggtggtg ctggcctggg cgctgtggac cctgcgcagg 1320 

ctgcgggagc gggcggatgc accaagtgtc agggcctgcc atgacactgt cactgtgctg 1380 

ggcctcaccg tgctgctggg aaccacctgg gccttggcct tcttttcttt tggcgtcttc 1440 

ctgctgcccc agctgttcct cttcaccatc ttaaactcgc tctacggttt cttccttttc 1500

ctgtggttct gctcccagcg gtgccgctca gaagcagagg ccaaggcaca gatagaggcc 1560 

ttcagctcct cccaaacaac acag 1584 



<210> 31 
<211> 63 

<212> PRT 

<213> Homo sapiens 

<400> 31 

Leu Lys Ser Pro Glu Gly Lys Ser Arg Lys Asn Pro Ala Arg Thr Cys
1 5 10 15

Lys Asp Leu Phe Leu Cys His Pro Glu Phe Lys Ser Gly Glu Tyr Trp
20 25 30

Ile Asp Pro Asn Gln Gly Cys Ile Lys Asp Ala Ile Lys Val Phe Cys
35 40 45

Asn Lys Arg Phe Glu Thr Gly Val Gly Glu Thr Cys Ile Ser Pro
50 55 60



<210> 32 

<211> 25 

<212> PRT 

<213> Homo sapiens 

<400> 32

Ile Ser Asn Val Gln Thr Phe Leu Arg Leu Leu Ser Thr Glu Ala Ser
1 5 10 15

Gln Asn Ile Thr Tyr His Cys Lys Asn
20 25



<210> 33 

<211> 33 

<212> PRT 

<213> Homo sapiens 



<400> 33 
Thr Val Leu Gly Glu Asp Gly Cys Ser Ser Arg Thr Gly Glu Trp Gly
1 5 10 15

Lys Thr Val Ile Glu Tyr Glu Thr Lys Lys Thr Thr Arg Leu Pro Ile
20 25 30

Val



<210> 34
<211> 65
<212> PRT
<213> Homo sapiens

<400> 34
Ile Asn Thr Ile Lys Asn Pro Leu Gly Thr Arg Asp Asn Pro Ala Arg
1 5 10 15

Ile Cys Lys Asp Leu Leu Asn Cys Glu Gln Lys Val Ser Asp Gly Lys
20 25 30

Tyr Trp Ile Asp Pro Asn Leu Gly Cys Pro Ser Asp Ala Ile Glu Val
35 40 45

Phe Ile Asn Thr Cys Asn Phe Ser Ala Gly Gly Gln Thr Cys Leu Pro
50 55 60

Pro
65



<210> 35 

<211> 26 

<212> PRT 

<213> Homo sapiens 



<400> 35 
Val Gly Lys Val Gln Met Asn Phe Leu His Leu Leu Ser Ser Glu Ala
1 5 10 15

Thr His Ile Ile Thr Ile His Cys Leu Asn
20 25



<210> 36
<211> 32
<212> PRT
<213> Homo sapiens

<400> 36
Lys Val Leu Ser Asp Asp Cys Lys Ile Gln Asp Gly Ser Trp His Lys
1 5 10 15

Ala Thr Phe Leu Phe His Thr Gln Glu Pro Asn Gln Leu Pro Val Ile
20 25 30

<210> 37
<211> 31
<212> PRT
<213> Homo sapiens

<400> 37
Gly Glu Ser Val Thr Leu Thr Cys Ser Val Ser Gly Phe Gly Pro Pro
1 5 10 15

Pro Val Thr Trp Leu Arg Asn Gly Lys Leu Ser Leu Thr Ile Ser
20 25 30



<210> 38 

<211> 57 

<212> PRT 

<213> Homo sapiens 



<400> 38 
Gly Arg Thr Val Arg Leu Gln Cys Pro Val Glu Gly Asp Pro Pro Pro
1 5 10 15

Thr Met Trp Thr Lys Asp Gly Arg Thr Ile His Ser Gly Trp Ser Arg
20 25 30

Phe Arg Val Leu Pro Gln Gly Leu Lys Val Lys Gln Val Glu Arg Glu
35 40 45

Asp Ala Gly Val Tyr Val Cys Lys Ala
50 55
45 



<210> 39 

<211> 59 

<212> PRT 

<213> Homo sapiens 



<400> 39 
Gly Ser Ser Val Arg Leu Lys Cys Val Ala Ser Gly His Pro Arg Pro
1 5 10 15

Asp Ile Thr Trp Met Lys Asp Asp Gln Ala Leu Thr Arg Pro Glu Ala
20 25 30

Ala Glu Pro Arg Lys Lys Lys Trp Thr Leu Ser Leu Lys Asn Leu Arg
35 40 45

Pro Glu Asp Ser Gly Lys Tyr Thr Cys Arg Val
50 55



<210> 40
<211> 79
<212> PRT
<213> Homo sapiens

<400> 40
Gly Gly Thr Thr Ser Phe Gln Cys Lys Val Arg Ser Asp Val Lys Pro
1 5 10 15
Val Ile Gln Trp Leu Lys Arg Val Glu Tyr Gly Ala Glu Gly Arg His
20 25 30
Asn Ser Thr Ile Asp Val Gly Gly Gln Lys Phe Val Val Leu Pro Thr
35 40 45
Gly Asp Val Trp Ser Arg Pro Asp Gly Ser Tyr Asn Lys Leu Leu Ile
50 55 60
Thr Arg Ala Arg Gln Asp Asp Ala Gly Met Tyr Ile Cys Leu Gly
65 70 75










<210> 41
<211> 78
<212> PRT
<213> Homo sapiens

<400> 41
Arg Gly Ser Leu Thr Val Gln Cys Val Tyr Arg Ser Gly Trp Glu Thr
1 5 10 15
Tyr Leu Lys Trp Trp Cys Arg Gly Ala Ile Trp Arg Asp Cys Lys Ile
20 25 30
Leu Val Lys Thr Ser Gly Ser Glu Gln Glu Val Lys Arg Asp Arg Val
35 40 45
Ser Ile Lys Asp Asn Gln Lys Asn Arg Thr Phe Thr Val Thr Met Glu
50 55 60
Asp Leu Met Lys Thr Asp Ala Asp Thr Tyr Trp Cys Gly Ile
65 70 75









<210> 42 
<211> 10 
<212> PRT 

<213> Homo sapiens 
<400> 42 

Val Phe Val Leu Gly Thr Leu Gly Ile Phe
1 5 10

<210> 43 
<211> 10 
<212> PRT 

<213> Homo sapiens 

<400> 43 

Val Phe Ile Leu Gly Thr Leu Leu Leu Trp
1 5 10

<210> 44
<211> 116
<212> PRT
<213> Homo sapiens

<400> 44
Cys Gly Gly Thr Leu Asp Leu Thr Glu Ser Ser Gly Ser Ile Ser Ser
1 5 10 15
Pro Asn Tyr Pro Asn Arg Ser Asp Tyr Pro Pro Asn Lys Glu Cys Val
20 25 30
Trp Arg Ile Arg Ala Pro Pro Gly Tyr Arg Val Val Glu Leu Thr Phe
35 40 45
Gln Asp Phe Asp Leu Glu Asp His Asp Gly Ala Pro Cys Arg Tyr Asp
50 55 60
Tyr Val Glu Ile Arg Asp Gly Asp Pro Ser Ser Pro Leu Leu Gly Arg
65 70 75 80
Phe Cys Gly Ser Gly Lys Pro Glu Asp Ile Arg Ser Thr Ser Asn Arg
85 90 95
Met Leu Ile Lys Phe Val Ser Asp Ala Ser Val Ser Lys Arg Gly Phe
100 105 110
Lys Ala Thr Tyr
115




















<210> 45
<211> 97
<212> PRT
<213> Homo sapiens

<400> 45
Gly Ser Val Leu Leu Ala Gln Glu Leu Pro Gln Gln Leu Thr Ser Pro
1 5 10 15
Gly Tyr Pro Glu Pro Tyr Gly Lys Gly Gln Glu Ser Ser Thr Asp Ile
20 25 30
Lys Ala Pro Glu Gly Phe Ala Val Arg Leu Val Phe Gln Asp Phe Asp
35 40 45
Leu Glu Pro Ser Gln Asp Cys Ala Gly Asp Ser Val Thr Val Ser Trp
50 55 60
Gly Trp Gly Gly Ser Arg Gln Asp Cys Gly Gln Gly Asp Ser Arg Gly
65 70 75 80
Cys Gly Lys Trp Arg Cys Pro Glu Ser Pro Ile Trp Arg Arg Asp Glu
85 90 95
Phe






















<210> 46
<211> 45
<212> PRT
<213> Homo sapiens

<400> 46
Cys Ala Pro Asn Asn Pro Cys Ser Asn Gly Gly Thr Cys Val Asn Thr
1 5 10 15
Pro Gly Gly Ser Ser Asp Asn Phe Gly Gly Tyr Thr Cys Glu Cys Pro
20 25 30
Pro Gly Asp Tyr Tyr Leu Ser Tyr Thr Gly Lys Arg Cys
35 40 45








<210> 47
<211> 67
<212> PRT
<213> Homo sapiens

<400> 47
Trp Ser Thr Asp Lys His Ile Gly Gly Arg Thr Ser Leu Gly Phe Asn
1 5 10 15
Leu Glu Tyr Arg Ile Arg Val Thr Cys Asp Glu Asn Tyr Tyr Gly Glu
20 25 30
Gly Cys Asn Lys Phe Cys Arg Pro Arg Asp Asp Ala Phe Gly His Tyr
35 40 45
Thr Cys Asp Glu Asn Gly Asn Lys Leu Cys Leu Glu Gly Trp Lys Gly
50 55 60
Glu Tyr Cys
65



<210> 48
<211> 59
<212> PRT
<213> Homo sapiens

<400> 48
Cys Asp Cys Asn Pro His Gly Ser Leu Ser Asp Asp Thr Cys Asp Ser
1 5 10 15
Asp Asp Glu Leu Phe Gly Glu Glu Thr Gly Gln Cys Leu Lys Cys Lys
20 25 30
Pro Asn Val Thr Gly Arg Arg Cys Asp Arg Cys Lys Pro Gly Tyr Tyr
35 40 45
Gly Leu Pro Ser Gly Asp Pro Gln Gln Gly Cys
50 55















<210> 49
<211> 31
<212> PRT
<213> Homo sapiens

<400> 49
Cys Val Pro Leu Cys Ala Gln Glu Cys Val His Gly Arg Cys Val Ala
1 5 10 15
Pro Asn Gln Cys Gln Cys Val Pro Gly Trp Arg Gly Asp Asp Cys
20 25 30

<210> 50
<211> 30
<212> PRT
<213> Homo sapiens

<400> 50
Cys Gln Phe Arg Cys Gln Cys His Gly Ala Pro Cys Asp Pro Gln Thr
1 5 10 15
Gly Ala Cys Phe Cys Pro Ala Glu Arg Thr Gly Pro Ser Cys
20 25 30

<210> 51
<211> 31
<212> PRT
<213> Homo sapiens

<400> 51
Cys Pro Ser Thr His Pro Cys Gln Asn Gly Gly Val Phe Gln Thr Pro
1 5 10 15
Gln Gly Ser Cys Ser Cys Pro Pro Gly Trp Met Gly Thr Ile Cys
20 25 30

<210> 52
<211> 31
<212> PRT
<213> Homo sapiens

<400> 52
Cys Ser Gln Glu Cys Arg Cys His Asn Gly Gly Leu Cys Asp Arg Phe
1 5 10 15
Thr Gly Gln Cys Arg Cys Ala Pro Gly Tyr Thr Gly Asp Arg Cys
20 25 30





<210> 53
<211> 31
<212> PRT
<213> Homo sapiens

<400> 53
Cys Ala Glu Thr Cys Asp Cys Ala Pro Asp Ala Arg Cys Phe Pro
1 5 10 15
Asn Gly Ala Cys Leu Cys Glu His Gly Phe Thr Gly Asp Arg Cys
20 25 30




<210> 54
<211> 27
<212> PRT
<213> Homo sapiens

<400> 54
Cys Asp Arg Glu His Ser Leu Ser Cys His Pro Met Asn Gly Glu Cys
1 5 10 15
Ser Cys Leu Pro Gly Trp Ala Gly Leu His Cys
20 25



<210> 55
<211> 31
<212> PRT
<213> Homo sapiens

<400> 55
Cys Gln Glu His Cys Leu Cys Leu His Gly Gly Val Cys Gln Ala Thr
1 5 10 15
Ser Gly Leu Cys Gln Cys Ala Pro Gly Tyr Thr Gly Pro His Cys
20 25 30



<210> 56
<211> 31
<212> PRT
<213> Homo sapiens

<400> 56
Cys Ser Ala Arg Cys Ser Cys Glu Asn Ala Ile Ala Cys Ser Pro Ile
1 5 10 15
Asp Gly Glu Cys Val Cys Lys Glu Gly Trp Gln Arg Gly Asn Cys
20 25 30

<210> 57
<211> 31
<212> PRT
<213> Homo sapiens

<400> 57
Cys Asn Ala Ser Cys Gln Cys Ala His Glu Ala Val Cys Ser Pro Gln
1 5 10 15
Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp His Gly Ala His Cys
20 25 30
<210> 58
<211> 31
<212> PRT
<213> Homo sapiens

<400> 58
Cys Ala Ser Arg Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro
1 5 10 15
His Gly Arg Cys Gln Cys Gln Ala Gly Trp Met Gly Ala Arg Cys
20 25 30



<210> 59
<211> 31
<212> PRT
<213> Homo sapiens

<400> 59
Cys Ser Asn Thr Cys Thr Cys Lys Asn Gly Gly Thr Cys Leu Pro Glu
1 5 10 15
Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys
20 25 30

<210> 60
<211> 30
<212> PRT
<213> Homo sapiens

<400> 60
Cys Val Pro Cys Lys Cys Ala Asn His Ser Phe Cys His Pro Ser Asn
1 5 10 15
Gly Thr Cys Tyr Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys
20 25 30

<210> 61
<211> 31
<212> PRT
<213> Homo sapiens

<400> 61
Cys Ala Gln Thr Cys Gln Cys His His Gly Gly Thr Cys His Pro Gln
1 5 10 15
Asp Gly Ser Cys Ile Cys Pro Leu Gly Trp Thr Gly His His Cys
20 25 30

<210> 62
<211> 31
<212> PRT
<213> Homo sapiens

<400> 62
Cys Ser Gln Pro Cys Gln Cys Gly Pro Gly Glu Lys Cys His Pro Glu
1 5 10 15
Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser Gly Ala Pro Cys
20 25 30



<210> 63
<211> 37
<212> PRT
<213> Homo sapiens

<400> 63
Gln Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp His Gly Ala His Cys
1 5 10 15
Gln Leu Pro Cys Pro Lys Gly Gln Phe Gly Glu Gly Cys Ala Ser Arg
20 25 30
Cys Asp Cys Asp His
35

<210> 64
<211> 31
<212> PRT
<213> Mus musculus

<400> 64
Cys Ser Asn Thr Cys Thr Cys Lys Asn Gly Gly Thr Cys Val Ser Glu
1 5 10 15
Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys
20 25 30

<210> 65
<211> 31
<212> PRT
<213> Mus musculus

<400> 65
Cys Val Gln Cys Lys Cys Asn Asn Asn His Ser Ser Cys His Pro Ser
1 5 10 15
Asp Gly Thr Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys
20 25 30

<210> 66
<211> 31
<212> PRT
<213> Mus musculus

<400> 66
Cys Ser Gln Leu Cys Gln Cys His His Gly Gly Thr Cys His Pro Gln
1 5 10 15
Asp Gly Ser Cys Ile Cys Thr Pro Gly Trp Thr Gly Pro Asn Cys
20 25 30

<210> 67
<211> 31
<212> PRT
<213> Mus musculus

<400> 67
Cys Ser Gln Leu Cys Gln Cys Asp Leu Gly Glu Met Cys His Pro Glu
1 5 10 15
Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser Gly Ala Asp Cys
20 25 30

<210> 68
<211> 35
<212> PRT
<213> Mus musculus

<400> 68
His Ala Ser Gly Asp Pro Val His Gly Gln Cys Arg Cys Gln Ala Gly
1 5 10 15
Trp Met Gly Thr Arg Cys His Leu Pro Cys Pro Glu Gly Phe Trp Gly
20 25 30
Ala Asn Cys
35



<210> 69
<211> 40
<212> PRT
<213> Mus musculus

<400> 69
Cys Thr Cys Lys Asn Gly Gly Thr Cys Val Ser Glu Asn Gly Asn Cys
1 5 10 15
Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys Gln Arg Pro Cys Pro
20 25 30
Pro Gly Arg Tyr Gly Lys Arg Cys
35 40

<210> 70
<211> 35
<212> PRT
<213> Mus musculus

<400> 70
Cys Lys Cys Asn Asn Asn His Ser Ser Cys His Pro Ser Asp Gly Thr
1 5 10 15
Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys Ser Glu Ala Cys
20 25 30
Pro Pro Gly
35

<210> 71
<211> 34
<212> PRT
<213> Mus musculus

<400> 71
Cys Gln Cys His His Gly Gly Thr Cys His Pro Gln Asp Gly Ser Cys
1 5 10 15
Ile Cys Thr Pro Gly Trp Thr Gly Pro Asn Cys Leu Glu Gly Cys Pro
20 25 30
Pro Arg

<210> 72
<211> 58
<212> PRT
<213> Mus musculus

<400> 72
His Gly Gln Cys Arg Cys Gln Ala Gly Trp Met Gly Thr Arg Cys His
1 5 10 15
Leu Pro Cys Pro Glu Gly Phe Trp Gly Ala Asn Cys Ser Asn Thr Cys
20 25 30
Thr Cys Lys Asn Gly Gly Thr Cys Val Ser Glu Asn Gly Asn Cys Val
35 40 45
Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys
50 55



<210> 73
<211> 28
<212> PRT
<213> Rattus sp.

<400> 73
Glu Cys Arg Cys His Asn Gly Gly Leu Cys Asp Arg Phe Thr Gly Gln
1 5 10 15
Cys His Cys Ala Pro Gly Tyr Ile Gly Asp Arg Cys
20 25

<210> 74
<211> 31
<212> PRT
<213> Rattus sp.

<400> 74
Cys Ala Glu Thr Cys Asp Cys Ala Pro Gly Ala Arg Cys Phe Pro Ala
1 5 10 15
Asn Gly Ala Cys Leu Cys Glu His Gly Phe Thr Gly Asp Arg Cys
20 25 30

<210> 75
<211> 33
<212> PRT
<213> Rattus sp.

<400> 75
Cys Gln Asp Pro Cys Thr Cys Asp Pro Glu His Ser Leu Ser Cys His
1 5 10 15
Pro Met His Gly Glu Cys Ser Cys Gln Pro Gly Trp Ala Gly Leu His
20 25 30
Cys

<210> 76
<211> 31
<212> PRT
<213> Rattus sp.

<400> 76
Cys Gln Glu His Cys Leu Cys Leu His Gly Gly Val Cys Leu Ala Asp
1 5 10 15
Ser Gly Leu Cys Arg Cys Ala Pro Gly Tyr Thr Gly Pro His Cys
20 25 30

<210> 77
<211> 31
<212> PRT
<213> Rattus sp.

<400> 77
Cys Ser Ser His Cys Ser Cys Glu Asn Ala Ile Ala Cys Ser Pro Val
1 5 10 15
Asp Gly Thr Cys Ile Cys Lys Glu Gly Trp Gln Arg Gly Asn Cys
20 25 30





<210> 78
<211> 31
<212> PRT
<213> Rattus sp.

<400> 78
Cys Asn Ala Ser Cys Gln Cys Ala His Glu Gly Val Cys Ser Pro
1 5 10 15
Thr Gly Ala Cys Thr Cys Thr Pro Gly Trp Arg Gly Val His Cys
20 25 30

<210> 79
<211> 31
<212> PRT
<213> Rattus sp.

<400> 79
Cys Ala Ser Val Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro
1 5 10 15
His Gly His Cys Arg Cys Gln Ala Gly Trp Met Gly Thr Arg Cys
20 25 30



<210> 80
<211> 31
<212> PRT
<213> Rattus sp.

<400> 80
Cys Ser Asn Ala Cys Thr Cys Lys Asn Gly Gly Thr Cys Val Pro Glu
1 5 10 15
Asn Gly Asn Cys Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys
20 25 30

<210> 81
<211> 30
<212> PRT
<213> Rattus sp.

<400> 81
Cys Val Pro Cys Lys Cys Asn Asn His Ser Ser Cys His Pro Ser Asp
1 5 10 15
Gly Thr Cys Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys
20 25 30

<210> 82
<211> 31
<212> PRT
<213> Rattus sp.

<400> 82
Cys Ser Gln Pro Cys Gln Cys His His Gly Ala Thr Cys His Pro Gln
1 5 10 15
Asp Gly Ser Cys Val Cys Ile Pro Gly Trp Thr Gly Pro Asn Cys
20 25 30

<210> 83
<211> 31
<212> PRT
<213> Rattus sp.

<400> 83
Cys Ser Gln Leu Cys Gln Cys Asp Pro Gly Glu Met Cys His Pro Glu
1 5 10 15
Thr Gly Ala Cys Val Cys Pro Pro Gly His Ser Gly Ala His Cys
20 25 30

<210> 84
<211> 40
<212> PRT
<213> Rattus sp.

<400> 84
Cys Arg Cys His Asn Gly Gly Leu Cys Asp Arg Phe Thr Gly Gln Cys
1 5 10 15
His Cys Ala Pro Gly Tyr Ile Gly Asp Arg Cys Arg Glu Glu Cys Pro
20 25 30
Val Gly Arg Phe Gly Gln Asp Cys
35 40

<210> 85
<211> 39
<212> PRT
<213> Rattus sp.

<400> 85
Cys Asp Cys Ala Pro Gly Ala Arg Cys Phe Pro Ala Asn Gly Ala Cys
1 5 10 15
Leu Cys Glu His Gly Phe Thr Gly Asp Arg Cys Thr Glu Arg Leu Cys
20 25 30
Pro Asp Gly Tyr Gly Leu Cys
35

<210> 86
<211> 42
<212> PRT
<213> Rattus sp.

<400> 86
Cys Thr Cys Asp Pro Glu His Ser Leu Ser Cys His Pro Met His Gly
1 5 10 15
Glu Cys Ser Cys Gln Pro Gly Trp Ala Gly Leu His Cys Asn Glu Ser
20 25 30
Cys Pro Gln Asp Thr His Gly Ala Gly Cys
35 40

<210> 87
<211> 40
<212> PRT
<213> Rattus sp.

<400> 87
Cys Leu Cys Leu His Gly Gly Val Cys Leu Ala Asp Ser Gly Leu Cys
1 5 10 15
Arg Cys Ala Pro Gly Tyr Thr Gly Pro His Cys Ala Asn Leu Cys Pro
20 25 30
Pro Asn Thr Tyr Gly Ile Asn Cys
35 40












<210> 88
<211> 40
<212> PRT
<213> Rattus sp.

<400> 88
Cys Ser Cys Glu Asn Ala Ile Ala Cys Ser Pro Val Asp Gly Thr Cys
1 5 10 15
Ile Cys Lys Glu Gly Trp Gln Arg Gly Asn Cys Ser Val Pro Cys Pro
20 25 30
Pro Gly Thr Trp Gly Phe Ser Cys
35 40












<210> 89
<211> 40
<212> PRT
<213> Rattus sp.

<400> 89
Cys Gln Cys Ala His Glu Gly Val Cys Ser Pro Gln Thr Gly Ala Cys
1 5 10 15
Thr Cys Thr Pro Gly Trp Arg Gly Val His Cys Gln Leu Pro Cys Pro
20 25 30
Lys Gly Gln Phe Gly Glu Gly Cys
35 40












<210> 90
<211> 40
<212> PRT
<213> Rattus sp.

<400> 90
Cys Asp Cys Asp His Ser Asp Gly Cys Asp Pro Val His Gly His Cys
1 5 10 15
Arg Cys Gln Ala Gly Trp Met Gly Thr Arg Cys His Leu Pro Cys Pro
20 25 30
Glu Gly Phe Trp Gly Ala Asn Cys
35 40












<210> 91
<211> 40
<212> PRT
<213> Rattus sp.

<400> 91
Cys Thr Cys Lys Asn Gly Gly Thr Cys Val Pro Glu Asn Gly Asn Cys
1 5 10 15
Val Cys Ala Pro Gly Phe Arg Gly Pro Ser Cys Gln Arg Pro Cys Pro
20 25 30
Pro Gly Arg Tyr Gly Lys Arg Cys
35 40











<210> 92
<211> 40
<212> PRT
<213> Rattus sp.

<400> 92
Cys Lys Cys Asn Asn His Ser Ser Cys His Pro Ser Asp Gly Thr Cys
1 5 10 15
Ser Cys Leu Ala Gly Trp Thr Gly Pro Asp Cys Ser Glu Ser Cys Pro
20 25 30
Pro Gly His Trp Gly Leu Lys Cys
35 40












<210> 93
<211> 40
<212> PRT
<213> Rattus sp.

<400> 93
Cys Gln Cys His His Gly Ala Thr Cys His
15
Val Leu Gln Pro Leu Glu Gly Asp Leu Cys Tyr Ala Asp Leu Thr Leu
20 25 30
Gln Leu Ala Gly Thr Ser Pro Arg Lys Ala Thr Thr Lys Leu Ser Ser
35 40 45
Ala Gln Val Asp Gln Val Glu Val Glu Tyr Val Thr Met Ala Ser Leu
50 55 60
Pro Lys Glu Asp Ile Ser Tyr Ala Ser Leu Thr Leu Gly Ala Glu Asp
65 70 75 80
Gln Glu Pro Thr Tyr Cys Asn Met Gly His Leu Ser Ser His Leu Pro
85 90 95
Gly Arg Gly Pro Glu Glu Pro Thr Glu Tyr Ser Thr Ile Ser Arg Pro
100 105 110





<210> 132
<211> 21
<212> PRT
<213> Homo sapiens

<400> 132
Met Asp His Cys Gly Ala Leu Phe Leu Cys Leu Cys Leu Leu Thr Leu
1 5 10 15
Gln Asn Ala Thr Thr
20





<210> 133
<211> 507
<212> PRT
<213> Homo sapiens

<400> 133
Glu Thr Trp Glu Glu Leu Leu Ser Tyr Met Glu Asn Met Gln Val Ser
1 5 10 15
Arg Gly Arg Ser Ser Val Phe Ser Ser Arg Gln Leu His Gln Leu Glu
20 25 30
Gln Met Leu Leu Asn Thr Ser Phe Pro Gly Tyr Asn Leu Thr Leu Gln
35 40 45
Thr Pro Thr Ile Gln Ser Leu Ala Phe Lys Leu Ser Cys Asp Phe Ser
50 55 60
Gly Leu Ser Leu Thr Ser Ala Thr Leu Lys Arg Val Pro Gln Ala Gly
65 70 75 80
Gly Gln His Ala Arg Gly Gln His Ala Met Gln Phe Pro Ala Glu Leu
85 90 95
Thr Arg Asp Ala Cys Lys Thr Arg Pro Arg Glu Leu Arg Leu Ile Cys
100 105 110
Ile Tyr Phe Ser Asn Thr His Phe Phe Lys Asp Glu Asn Asn Ser Ser
115 120 125
Leu Leu Asn Asn Tyr Val Leu Gly Ala Gln Leu Ser His Gly His Val
130 135 140
Asn Asn Leu Arg Asp Pro Val Asn Ile Ser Phe Trp His Asn Gln Ser
145 150 155 160
Leu Glu Gly Tyr Thr Leu Thr Cys Val Phe Trp Lys Glu Gly Ala Arg
165 170 175
Lys Gln Pro Trp Gly Gly Trp Ser Pro Glu Gly Cys Arg Thr Glu Gln
180 185 190
Pro Ser His Ser Gln Val Leu Cys Arg Cys Asn His Leu Thr Tyr Phe
195 200 205
Ala Val Leu Met Gln Leu Ser Pro Ala Leu Val Pro Ala Glu Leu Leu
210 215 220
Ala Pro Leu Thr Tyr Ile Ser Leu Val Gly Cys Ser Ile Ser Ile Val
225 230 235 240
Ala Ser Leu Ile Thr Val Leu Leu His Phe His Phe Arg Lys Gln Ser
245 250 255
Asp Ser Leu Thr Arg Ile His Met Asn Leu His Ala Ser Val Leu Leu
260 265 270
Leu Asn Ile Ala Phe Leu Leu Ser Pro Ala Phe Ala Met Ser Pro Val
275 280 285
Pro Gly Ser Ala Cys Thr Ala Leu Ala Ala Ala Leu His Tyr Ala Leu
290 295 300
Leu Ser Cys Leu Thr Trp Met Ala Ile Glu Gly Phe Asn Leu Tyr Leu
305 310 315 320
Leu Leu Gly Arg Val Tyr Asn Ile Tyr Ile Arg Arg Tyr Val Phe Lys
325 330 335
Leu Gly Val Leu Gly Trp Gly Ala Pro Ala Leu Leu Val Leu Leu Ser
340 345 350
Leu Ser Val Lys Ser Ser Val Tyr Gly Pro Cys Thr Ile Pro Val Phe
355 360 365
Asp Ser Trp Glu Asn Gly Thr Gly Phe Gln Asn Met Ser Ile Cys Trp
370 375 380
Val Arg Ser Pro Val Val His Ser Val Leu Val Met Gly Tyr Gly Gly
385 390 395 400
Leu Thr Ser Leu Phe Asn Leu Val Val Leu Ala Trp Ala Leu Trp Thr
405 410 415
Leu Arg Arg Leu Arg Glu Arg Ala Asp Ala Pro Ser Val Arg Ala Cys
420 425 430
His Asp Thr Val Thr Val Leu Gly Leu Thr Val Leu Leu Gly Thr Thr
435 440 445
Trp Ala Leu Ala Phe Phe Ser Phe Gly Val Phe Leu Leu Pro Gln Leu
450 455 460
Phe Leu Phe Thr Ile Leu Asn Ser Leu Tyr Gly Phe Phe Leu Phe Leu
465 470 475 480
Trp Phe Cys Ser Gln Arg Cys Arg Ser Glu Ala Glu Ala Lys Ala Gln
485 490 495
Ile Glu Ala Phe Ser Ser Ser Gln Thr Thr Gln
500 505



<210> 134
<211> 223
<212> PRT
<213> Homo sapiens

<400> 134
Glu Thr Trp Glu Glu Leu Leu Ser Tyr Met Glu Asn Met Gln Val Ser
1 5 10 15
Arg Gly Arg Ser Ser Val Phe Ser Ser Arg Gln Leu His Gln Leu Glu
20 25 30
Gln Met Leu Leu Asn Thr Ser Phe Pro Gly Tyr Asn Leu Thr Leu Gln
35 40 45
Thr Pro Thr Ile Gln Ser Leu Ala Phe Lys Leu Ser Cys Asp Phe Ser
50 55 60
Gly Leu Ser Leu Thr Ser Ala Thr Leu Lys Arg Val Pro Gln Ala Gly
65 70 75 80
Gly Gln His Ala Arg Gly Gln His Ala Met Gln Phe Pro Ala Glu Leu
85 90 95
Thr Arg Asp Ala Cys Lys Thr Arg Pro Arg Glu Leu Arg Leu Ile Cys
100 105 110
Ile Tyr Phe Ser Asn Thr His Phe Phe Lys Asp Glu Asn Asn Ser Ser
115 120 125
Leu Leu Asn Asn Tyr Val Leu Gly Ala Gln Leu Ser His Gly His Val
130 135 140
Asn Asn Leu Arg Asp Pro Val Asn Ile Ser Phe Trp His Asn Gln Ser
145 150 155 160
Leu Glu Gly Tyr Thr Leu Thr Cys Val Phe Trp Lys Glu Gly Ala Arg
165 170 175
Lys Gln Pro Trp Gly Gly Trp Ser Pro Glu Gly Cys Arg Thr Glu Gln
180 185 190
Pro Ser His Ser Gln Val Leu Cys Arg Cys Asn His Leu Thr Tyr Phe
195 200 205
Ala Val Leu Met Gln Leu Ser Pro Ala Leu Val Pro Ala Glu Leu
210 215 220





<210> 135
<211> 25
<212> PRT
<213> Homo sapiens

<400> 135
Leu Ala Pro Leu Thr Tyr Ile Ser Leu Val Gly Cys Ser Ile Ser Ile
1 5 10 15
Val Ala Ser Leu Ile Thr Val Leu Leu
20 25

<210> 136
<211> 20
<212> PRT
<213> Homo sapiens

<400> 136
Leu His Ala Ser Val Leu Leu Leu Asn Ile Ala Phe Leu Leu Ser Pro
1 5 10 15
Ala Phe Ala Met
20

<210> 137
<211> 21
<212> PRT
<213> Homo sapiens

<400> 137
Tyr Ala Leu Leu Ser Cys Leu Thr Trp Met Ala Ile Glu Gly Phe Asn
1 5 10 15
Leu Tyr Leu Leu Leu
20

<210> 138
<211> 19
<212> PRT
<213> Homo sapiens

<400> 138
Leu Gly Val Leu Gly Trp Gly Ala Pro Ala Leu Leu Val Leu Leu Ser
1 5 10 15
Leu Ser Val

<210> 139
<211> 25
<212> PRT
<213> Homo sapiens

<400> 139
Val Leu Val Met Gly Tyr Gly Gly Leu Thr Ser Leu Phe Asn Leu Val
1 5 10 15
Val Leu Ala Trp Ala Leu Trp Thr Leu
20 25

<210> 140
<211> 21
<212> PRT
<213> Homo sapiens

<400> 140
Val Thr Val Leu Gly Leu Thr Val Leu Leu Gly Thr Thr Trp Ala Leu
1 5 10 15
Ala Phe Phe Ser Phe
20

<210> 141
<211> 20
<212> PRT
<213> Homo sapiens

<400> 141
Leu Phe Leu Phe Thr Ile Leu Asn Ser Leu Tyr Gly Phe Phe Leu Phe
1 5 10 15
Leu Trp Phe Cys
20

<210> 142
<211> 24
<212> PRT
<213> Homo sapiens

<400> 142
Ser Gln Arg Cys Arg Ser Glu Ala Glu Ala Lys Ala Gln Ile Glu Ala
1 5 10 15
Phe Ser Ser Ser Gln Thr Thr Gln
20

<210> 143
<211> 16
<212> PRT
<213> Homo sapiens

<400> 143
Ser Pro Val Pro Gly Ser Ala Cys Thr Ala Leu Ala Ala Ala Leu His
1 5 10 15

<210> 144
<211> 37
<212> PRT
<213> Homo sapiens

<400> 144
Lys Ser Ser Val Tyr Gly Pro Cys Thr Ile Pro Val Phe Asp Ser Trp
1 5 10 15
Glu Asn Gly Thr Gly Phe Gln Asn Met Ser Ile Cys Trp Val Arg Ser
20 25 30
Pro Val Val His Ser
35

<210> 145
<211> 7
<212> PRT
<213> Homo sapiens

<400> 145
Gly Val Phe Leu Leu Pro Gln
1 5

<210> 146
<211> 17
<212> PRT
<213> Homo sapiens

<400> 146
His Phe His Phe Arg Lys Gln Ser Asp Ser Leu Thr Arg Ile His Met
1 5 10 15
Asn

<210> 147
<211> 14
<212> PRT
<213> Homo sapiens

<400> 147
Gly Arg Val Tyr Asn Ile Tyr Ile Arg Arg Tyr Val Phe Lys
1 5 10

<210> 148
<211> 18
<212> PRT
<213> Homo sapiens

<400> 148
Arg Arg Leu Arg Glu Arg Ala Asp Ala Pro Ser Val Arg Ala Cys
1 5 10 15
Asp Thr



(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

REVISED VERSION

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date
4 January 2001 (04.01.2001)

PCT

(10) International Publication Number
WO 01/00673 A1

(51) International Patent Classification: C07K 14/47, C07H 21/04, C12N 15/63, C12P 21/02

(21) International Application Number: PCT/US00/18198

(22) International Filing Date: 29 June 2000 (29.06.2000)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
09/345,464 30 June 1999 (30.06.1999) US

(71) Applicant: MILLENNIUM PHARMACEUTICALS, INC. [US/US]; 75 Sidney Street, Cambridge, MA 02139 (US).

(72) Inventors: BARNES, Thomas, M.; 22 Hanson Street #2, Boston, MA 02118 (US). FRASER, Christopher, C.; 52 Grassland Street, Lexington, MA 02421 (US). WRIGHTON, Nicholas; 18 Lloyd Street, Winchester, MA 01890 (US). MYERS, Paul; 14 Cornelius Way, Cambridge, MA 02141 (US). BUSFIELD, Samantha, J.; Apartment 1, 15 Trowbridge Street, Cambridge, MA 02138 (US). SHARP, John, D.; 245 Park Avenue, Arlington, MA 02476 (US).

(74) Agents: CORUZZI, Laura, A. et al.; Pennie & Edmonds LLP, 1155 Avenue of the Americas, New York, NY 10036 (US).



(81) Designated States (national): AE, AG, AL, AM, AT, AU, AZ, BA, BE, BG, BR, BY, CA, CH, CN, CR, CU, CZ, DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).

Published:
- With international search report.

(88) Date of publication of the revised international search report: 29 March 2001

(15) Information about Correction:
see PCT Gazette No. 13/2001 of 29 March 2001, Section II

For two-letter codes and other abbreviations, refer to the "Guidance Notes on Codes and Abbreviations" appearing at the beginning of each regular issue of the PCT Gazette.



(54) Title: MEMBRANE-ASSOCIATED AND SECRETED PROTEINS AND USES THEREOF

(57) Abstract: The invention provides isolated nucleic acid molecules, designated INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378, which encode wholly secreted or membrane-associated proteins. The invention also provides antisense nucleic acid molecules, expression vectors containing the nucleic acid molecules of the invention, host cells into which the expression vectors have been introduced, and non-human transgenic animals in which a nucleic acid molecule of the invention has been introduced or disrupted. The invention still further provides isolated polypeptides, fusion polypeptides, antigenic peptides and antibodies. Diagnostic, screening and therapeutic methods utilizing compositions of the invention are also provided.
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1. I I Claims Nos.: 

because they relate to subject matter not required to be searched by this Authority, namely: 

Claims Nos.: 

because th^ relate to {NUts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out* specifically: 

3. Q Claims Nos.: 

because they are dependent claims and arc not <hafted in accordance with the second and third sentences of Rule 6.4(a). 

Box II Observationa where unity of invention is lacking (Continuation of it^tn 2 of first sheet) 
This International Searching Authority found multiple inventions in this international application, as follows: 
Please See Extra Sheet. 



1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all searchable
claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report covers
only those claims for which fees were paid, specifically claims Nos.:

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows: 

This application contains the following inventions or groups of inventions which are not so linked as to form a single
inventive concept under PCT Rule 13.1. In order for all inventions to be searched, the appropriate additional search 
fees must be paid. 

Group I, claim(s) 1-10 and 12, in so far as they are drawn to Intercept 340, polynucleotides of SEQ ID NOS: 1 and 3,
vector, host cell, method of producing a protein recombinantly and protein of SEQ ID NO: 2. 

Groups II-VII, claim(s) 1-10 and 12, in so far as they are drawn to the next six polynucleotides of distinct cDNA clones
and encoded proteins, identified as Mango 003, Mango 347, Tango 272, Tango 295, Tango 354 and Tango 378, as 
listed in Tables 1 and 2. 

Groups VIII-XIV, claim(s) 11 and 15, in so far as they are drawn to antibodies to one of the seven proteins listed
above.

Groups XV-XXI, claims 13, 14, 19, 20 and 22, in so far as they are drawn to a method for detecting the presence of in
a sample or identifying a compound which binds to or modulates the activity of a polypeptide of one of the seven
proteins listed above.

Groups XXII-XXVIII, claims 16 and 17, in so far as they are drawn to a method for detecting the nucleic acids of one
of the seven cDNA clones listed above.

Groups XXIX-XXXV, claim 18, in so far as it is drawn to a kit comprising a compound of unspecified constitution
which selectively binds to a nucleic acid molecule of the seven cDNA clones listed above.

Groups XXXVI-XLII, claim 21, in so far as it is drawn to a method for modulating the activity of one of the seven
proteins listed above.

The inventions listed as Groups I-XLII do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Group I
corresponds to the first invention wherein the first product is the polynucleotide and the first method of using is the
method of making the protein. Note that there is no method of making the polynucleotide. The invention also includes
the protein made. Each of groups II-VII does not share the same or corresponding special technical feature because
each group is drawn to a different polynucleotide and encoded protein, and each of groups VIII-XLII does not share the
same or corresponding special technical feature because each group is drawn to different compounds or methods of
using the seven polynucleotides and encoded proteins. This Authority therefore considers that the several inventions do
not share a special technical feature within the meaning of PCT Rule 13.2 and thus do not relate to a single general
inventive concept within the meaning of PCT Rule 13.1.

MEMBRANE-ASSOCIATED AND SECRETED PROTEINS
AND USES THEREOF 



This application claims priority to co-pending U.S. Application No. 09/345,464,
filed June 30, 1999, the entire contents of which are incorporated herein by reference in their
entirety.

Background of the Invention

Many secreted proteins, for example, cytokines, play a vital role in the regulation of
cell growth, cell differentiation, and a variety of specific cellular responses. A number of
medically useful proteins, including erythropoietin, granulocyte-macrophage colony
stimulating factor, human growth hormone, and various interleukins, are secreted proteins.

Many membrane-associated proteins are receptors which bind a ligand and
transduce an intracellular signal, leading to a variety of cellular responses. The
identification and characterization of such a receptor enables one to identify both the
ligands which bind to the receptor and the intracellular molecules and signal transduction
pathways associated with the receptor, permitting one to identify or design modulators of
receptor activity, e.g., receptor agonists or antagonists and modulators of signal
transduction.

Thus, an important goal in the design and development of new therapies is the
identification and characterization of membrane-associated and secreted proteins and the
genes which encode them.

Summary of the Invention 

The present invention is based, at least in part, on the discovery of cDNA molecules
encoding INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295,
TANGO 354, and TANGO 378, all of which are either wholly secreted or transmembrane
proteins. These proteins, fragments, derivatives, and variants thereof are collectively
referred to as "polypeptides of the invention" or "proteins of the invention." Nucleic acid
molecules encoding the polypeptides or proteins of the invention are collectively referred to
as "nucleic acids of the invention."

The nucleic acids and polypeptides of the present invention are useful as modulating
agents in regulating a variety of cellular processes. Accordingly, in one aspect, this
invention provides isolated nucleic acid molecules encoding a polypeptide of the invention
or a biologically active portion thereof. The present invention also provides nucleic acid
molecules which are suitable for use as primers or hybridization probes for the detection of
nucleic acids encoding a polypeptide of the invention.

The invention features nucleic acid molecules which are at least 45% (or 55%, 65%,
75%, 85%, 95%, or 98%) identical to the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7,
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the nucleotide sequence of the
cDNA insert of a clone deposited with ATCC® as Accession Number 207178 (the "cDNA
of ATCC® Accession Number 207178"), the nucleotide sequence of the cDNA insert of a
clone deposited with ATCC® as Accession Number PTA-249 (the "cDNA of ATCC®
Accession Number PTA-249"), or the nucleotide sequence of the cDNA insert of a clone
deposited with ATCC® as Accession Number PTA-250 (the "cDNA of ATCC® Accession
Number PTA-250"), or a complement thereof.

The invention features nucleic acid molecules which include a fragment of at least
300 (325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1200, 1400,
1600, 1800, 2000, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, or 4000) nucleotides of
the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22,
24, 25, 27, 28 or 30, the nucleotide sequence of the cDNA of ATCC® Accession Number
207178, the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-249, or
the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-250, or a
complement thereof.

The invention also features nucleic acid molecules which include a nucleotide
sequence encoding a protein having an amino acid sequence that is at least 45% (or 55%,
65%, 75%, 85%, 95%, or 98%) identical to the amino acid sequence of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250.

In preferred embodiments, the nucleic acid molecules have the nucleotide sequence
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the
nucleotide sequence of the cDNA of ATCC® Accession Number 207178, the nucleotide
sequence of the cDNA of ATCC® Accession Number PTA-249, or the nucleotide sequence
of the cDNA of ATCC® Accession Number PTA-250, or a complement thereof.

Also within the invention are nucleic acid molecules which encode a fragment of a
polypeptide having the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26,
or 29, or a fragment including at least 15 (25, 30, 50, 100, 150, 300, 400, 500, 600, 700,
800, 900, 1000, 1100, 1200, 1300, or 1400) contiguous amino acids of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250.



The invention includes nucleic acid molecules which encode a naturally occurring
allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250, wherein the nucleic acid molecule hybridizes to a nucleic acid
molecule consisting of a nucleic acid sequence encoding SEQ ID NOs:2, 5, 8, 11, 14, 17,
20, 23, 26, or 29, the nucleotide sequence of the cDNA of ATCC® Accession Number
207178, the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-249, or
the nucleotide sequence of the cDNA of ATCC® Accession Number PTA-250, or a
complement thereof under stringent conditions.

Also within the invention are isolated polypeptides or proteins having an amino acid 
sequence that is at least about 60%, preferably 65%, 75%, 85%, 95%, or 98% identical to 
the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, the amino
acid sequence encoded by the cDNA of ATCC® Accession Number 207178, the amino acid 
sequence encoded by the cDNA of ATCC® Accession Number PTA-249, or the amino acid 
sequence encoded by the cDNA of ATCC® Accession Number PTA-250. 

Also within the invention are isolated polypeptides or proteins which are encoded by 
a nucleic acid molecule having a nucleotide sequence that is at least about 60%, preferably 
65%, 75%, 85%, or 95% identical to the nucleic acid sequence encoding SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29, and isolated polypeptides or proteins which are encoded by a
nucleic acid molecule having a nucleotide sequence which hybridizes under stringent
hybridization conditions to a nucleic acid molecule having the nucleotide sequence of SEQ
ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a
complement thereof, the non-coding strand of the cDNA of ATCC® Accession Number
207178, the non-coding strand of the cDNA of ATCC® Accession Number PTA-249, or the 
non-coding strand of the cDNA of ATCC® Accession Number PTA-250. 

Also within the invention are polypeptides which are naturally occurring allelic 
variants of a polypeptide that includes the amino acid sequence of SEQ ID NOs:2, 5, 8, 11,
14, 17, 20, 23, 26, or 29, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number 207178, the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA of ATCC®
Accession Number PTA-250, wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule having the sequence of SEQ ID
NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a complement
thereof, under stringent conditions. Such allelic variants differ at 1%, 2%, 3%, 4%, or 5% of
the amino acid residues.

The invention also features nucleic acid molecules that hybridize under stringent
conditions to a nucleic acid molecule having the nucleotide sequence of SEQ ID NOs:1, 3,
4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the cDNA of ATCC®
Accession Number 207178, the cDNA of ATCC® Accession Number PTA-249, or the
cDNA of ATCC® Accession Number PTA-250, or a complement thereof. In other
embodiments, the nucleic acid molecules are at least 300 (325, 350, 375, 400, 425, 450,
500, 550, 600, 650, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600,
2800, 3000, 3200, 3400, 3600, 3800, 4000, or 4200) nucleotides in length and hybridize
under stringent conditions to a nucleic acid molecule consisting of the nucleotide sequence
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, the
cDNA of ATCC® Accession Number 207178, the cDNA of ATCC® Accession Number
PTA-249, or the cDNA of ATCC® Accession Number PTA-250, or a complement thereof.
In other embodiments, the isolated nucleic acid molecules encode an extracellular,
transmembrane, or cytoplasmic domain of a polypeptide of the invention.

In another embodiment, the invention provides an isolated nucleic acid molecule
which is antisense to the coding strand of a nucleic acid of the invention.

Another aspect of the invention provides vectors, e.g., recombinant expression
vectors, comprising a nucleic acid molecule of the invention. In another embodiment, the
invention provides host cells containing such a vector or a nucleic acid molecule of the
invention. The invention also provides methods for producing a polypeptide of the
invention by culturing, in a suitable medium, a host cell of the invention containing a
recombinant expression vector such that a polypeptide is produced.

Another aspect of this invention features isolated or recombinant proteins and 
polypeptides of the invention. Preferred proteins and polypeptides possess at least one 
biological activity possessed by the corresponding naturally-occurring human polypeptide. 
An activity, a biological activity, or a functional activity of a polypeptide or nucleic acid of 
the invention refers to an activity exerted by a protein, polypeptide or nucleic acid molecule 
of the invention on a responsive cell as determined in vivo, or in vitro, according to standard
techniques. Such activities can be a direct activity, such as an association with or an
enzymatic activity on a second protein or an indirect activity, such as a cellular signaling
activity mediated by interaction of the protein with a second protein.

In one embodiment, the isolated polypeptide of the invention lacks both a 
transmembrane and a cytoplasmic domain. In another embodiment, the polypeptide lacks 
both a transmembrane domain and a cytoplasmic domain and is soluble under physiological 
conditions. 

For INTERCEPT 340, biological activities include, e.g., (1) the ability to form
protein-protein interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring
polypeptide; (3) the ability to interact with an INTERCEPT 340 receptor, e.g., a cell surface
receptor (e.g., an integrin); (4) the ability to modulate the activity of an intracellular
molecule that participates in a signal transduction pathway, e.g., an intracellular molecule in
the integrin signalling (e.g., a cdk2 inhibitor); (5) the ability to assemble into fibrils; (6) the
ability to strengthen and organize the extracellular matrix; (7) the ability to modulate the
shape of tissues and cells; (8) the ability to interact with (e.g., bind to) components of the
extracellular matrix; and (9) the ability to modulate cell migration. Other activities include
the ability to modulate function, survival, morphology, migration, proliferation and/or
differentiation of cells of tissues in which it is expressed (e.g., splenic cells). For example,
additional biological activities of INTERCEPT 340 include: (1) the ability to modulate
splenic cell activity; (2) the ability to modulate skeletal morphogenesis; and/or (3) the
ability to modulate smooth muscle cell proliferation and differentiation.

For MANGO 003, biological activities include, e.g., (1) the ability to form protein-protein
(e.g., protein-ligand) interactions with proteins in the signaling pathway of the
naturally-occurring polypeptide; (2) the ability to interact with (e.g., bind to) a ligand of the
naturally-occurring polypeptide; (3) the ability to interact with a MANGO 003 receptor,
e.g., a cell surface receptor; (4) the ability to modulate cell surface recognition; (5) the
ability to transduce an extracellular signal (e.g., by interacting with a ligand and/or a cell-surface
receptor); (6) the ability to modulate a signal transduction pathway; and (7) the
ability to modulate signal transmission at a chemical synapse. Other activities include the
ability to modulate function, survival, morphology, proliferation and/or differentiation of
cells of tissues in which it is expressed (e.g., thyroid, liver, skeletal muscle, kidney, heart,
lung, testis and brain). For example, the activities of MANGO 003 can include modulation
of endocrine, hepatic, skeletal muscular, renal, cardiovascular, reproductive and/or brain
function.

For MANGO 347, biological activities include, e.g., (1) the ability to form protein- 
protein interactions with proteins in the signaling pathway of the naturally-occurring 
polypeptide; (2) the ability to interact with a ligand of the naturally-occurring polypeptide; 
(3) the ability to interact with a MANGO 347 receptor; and (4) the ability to modulate a 
developmental process, e.g., morphogenesis, cellular migration, adhesion, proliferation,
differentiation, and/or survival. Other activities include the ability to modulate function,
survival, morphology, proliferation and/or differentiation of cells of tissues in which it is
expressed (e.g., brain cells). For example, the activities of MANGO 347 can include
modulation of neural (e.g., CNS) function.

For TANGO 272, biological activities include, e.g., (1) the ability to form protein-protein
interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 272 receptor, e.g., a cell surface receptor (e.g., an
integrin); (4) the ability to modulate cell-cell contact; (5) the ability to modulate cell
attachment; (6) the ability to modulate cell fate; and (7) the ability to modulate tissue repair
and/or wound healing. Other activities include the ability to modulate function, survival,
morphology, proliferation and/or differentiation of cells of tissues in which it is expressed
(e.g., microvascular endothelial cells). For example, the activities of MANGO 347 can
include modulation of cardiovascular function.

For TANGO 295, biological activities include, e.g., (1) the ability to form protein-protein
interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 295 receptor; (4) the ability to interact with (e.g., bind to)
a nucleic acid; and (5) the ability to elicit pyrimidine-specific endonuclease activity. Other
activities include the ability to modulate function, survival, morphology, proliferation
and/or differentiation of cells of tissues in which it is expressed (e.g., mammary
epithelium).

For TANGO 354, biological activities include, e.g., (1) the ability to form protein-protein
interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with (e.g., bind to) a TANGO 354 receptor, e.g., a cell surface receptor;
(4) the ability to modulate cell surface recognition; (5) the ability to modulate cellular
motility, e.g., chemotaxis and/or chemokinesis; (6) the ability to transduce an extracellular
signal (e.g., by interacting with a ligand and/or a cell-surface receptor); and (7) the ability to
modulate a signal transduction pathway. Other activities include the ability to modulate
function, survival, morphology, proliferation and/or differentiation of cells of tissues in
which it is expressed (e.g., hematopoietic tissues). For example, TANGO 354 biological
activities can further include: (1) regulation of hematopoiesis; (2) modulation (e.g.,
increasing or decreasing) of haemostasis; (3) modulation of an inflammatory response; (4)
modulation of neoplastic growth, e.g., inhibition of tumor growth; and (5) modulation of
thrombolysis.

For TANGO 378, biological activities include, e.g., (1) the ability to form protein-protein
interactions with proteins in the signaling pathway of the naturally-occurring
polypeptide; (2) the ability to bind a ligand of the naturally-occurring polypeptide; (3) the
ability to interact with a TANGO 378 receptor; (4) the ability to transduce an extracellular
signal; and (5) the ability to modulate a signal transduction pathway (e.g., adenylate
cyclase, or phosphatidylinositol 4,5-bisphosphate (PIP2), inositol 1,4,5-triphosphate (IP3)).
Other activities include the ability to modulate function, survival, morphology, proliferation
and/or differentiation of cells of tissues in which it is expressed (e.g., natural killer cells).
For example, TANGO 378 biological activities can further include the ability to modulate
an immune response in a subject, for example, (1) by modulating immune cytotoxic
responses against pathogenic organisms, e.g., viruses, bacteria, and parasites; (2) by
modulating organ rejection after transplantation; and (3) by modulating immune recognition
and lysis of normal and malignant cells.

In one embodiment, a polypeptide of the invention has an amino acid sequence 
sufficiently identical to an identified domain of a polypeptide of the invention. As used 
herein, the term "sufficiently identical" refers to a first amino acid or nucleotide sequence 
which contains a sufficient or minimum number of identical or equivalent (e.g., with a 
similar side chain) amino acid residues or nucleotides to a second amino acid or nucleotide 
sequence such that the first and second amino acid or nucleotide sequences have a common 
structural domain and/or common functional activity. For example, amino acid or 
nucleotide sequences which contain a common structural domain having about 60% 
identity, preferably 65% identity, more preferably 75%, 85%, 95%, 98% or more identity
are defined herein as sufficiently identical. 
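
As an illustration only, the "about 60% identity" threshold above can be sketched in Python. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over columns in which neither sequence has a gap; this passage does not fix the denominator, so that choice is an assumption, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns without a gap.

    Assumption: inputs are pre-aligned and of equal length; gapped columns
    are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def sufficiently_identical(aligned_a: str, aligned_b: str,
                           threshold: float = 60.0) -> bool:
    """True if percent identity meets the (illustrative) 60% threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Raising `threshold` to 65, 75, 85, 95, or 98 gives the progressively stricter levels recited above.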

In one embodiment, a MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, or TANGO 378 polypeptide of the invention includes a signal peptide. 

In another embodiment, a nucleic acid molecule of the invention encodes a 
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 378 
polypeptide which includes a signal peptide. 

In another embodiment, a MANGO 003, TANGO 272, TANGO 354, or TANGO 
378 polypeptide of the invention includes one or more of the following domains: (1) a 
signal peptide; (2) an N-terminal extracellular domain; (3) a C-terminal transmembrane 
domain; and (4) a cytoplasmic domain. 

The polypeptides of the present invention, or biologically active portions thereof,
can be operably linked to a heterologous amino acid sequence to form fusion proteins. In
one embodiment, the fusion protein consists of a chimeric protein assembled from portions
of the protein from different species.

In one embodiment, the isolated polypeptide of the invention lacks both a
transmembrane and a cytoplasmic domain. In another embodiment, the polypeptide lacks 
both a transmembrane domain and a cytoplasmic domain and is soluble under physiological 
conditions. 

The invention further features antibodies that specifically bind a polypeptide of the
invention such as monoclonal or polyclonal antibodies. In addition, the polypeptides of the
invention or biologically active portions thereof, or antibodies of the invention, can be
incorporated into pharmaceutical compositions, which optionally include pharmaceutically
acceptable carriers.

In another aspect, the present invention provides methods for detecting the presence
of the activity or expression of a polypeptide of the invention in a biological sample by
contacting the biological sample with an agent capable of detecting an indicator of activity
such that the presence of activity is detected in the biological sample.

In another aspect, the invention provides methods for modulating activity of a 
polypeptide of the invention comprising contacting a cell with an agent that modulates 
(inhibits or stimulates) the activity or expression of a polypeptide of the invention such that 
activity or expression in the cell is modulated. In one embodiment, the agent is an antibody
that specifically binds to a polypeptide of the invention.

In another embodiment, the agent modulates expression of a polypeptide of the 
invention by modulating transcription, splicing, or translation of an mRNA encoding a 
polypeptide of the invention. In yet another embodiment, the agent is a nucleic acid 
molecule having a nucleotide sequence that is antisense to the coding strand of an mRNA 
encoding a polypeptide of the invention.

The present invention also provides methods to treat a subject having a disorder 
characterized by aberrant activity of a polypeptide of the invention or aberrant expression of 
a nucleic acid of the invention by administering an agent which is a modulator of the 
activity of a polypeptide of the invention or a modulator of the expression of a nucleic acid
of the invention to the subject. In one embodiment, the modulator is a protein of the
invention. In another embodiment, the modulator is a nucleic acid of the invention. In
other embodiments, the modulator is a peptide, peptidomimetic, or other small organic
molecule. The present invention also provides diagnostic assays for identifying the presence
or absence of a genetic lesion or mutation characterized by at least one of: (i) aberrant
modification or mutation of a gene encoding a polypeptide of the invention, (ii) mis-regulation
of a gene encoding a polypeptide of the invention, and (iii) aberrant post-translational
modification of a polypeptide of the invention, wherein a wild-type form of the gene encodes a
protein having the activity of the polypeptide of the invention.

In another aspect, the invention provides a method for identifying a compound that 
binds to or modulates the activity of a polypeptide of the invention. In general, such 
methods entail measuring a biological activity of the polypeptide in the presence and 
absence of a test compound and identifying those compounds which alter the activity of the 
polypeptide. 

The invention also features methods for identifying a compound which modulates 
the expression of a polypeptide or nucleic acid of the invention by measuring the expression
of the polypeptide or nucleic acid in the presence and absence of the compound. 

In yet a further aspect, the invention provides substantially purified antibodies or
fragments thereof, including human and non-human antibodies or fragments thereof, which
antibodies or fragments specifically bind to a polypeptide comprising an amino acid
sequence selected from the group consisting of: the amino acid sequence of SEQ ID NOs:2,
5, 8, 11, 14, 17, 20, 23, 26, or 29 or the amino acid sequence encoded by the cDNA insert of
the plasmid deposited with the ATCC® as Accession Number 207178, the amino acid
sequence encoded by the cDNA insert of the plasmid deposited with the ATCC® as
Accession Number PTA-249, or the amino acid sequence encoded by the cDNA insert of
the plasmid deposited with the ATCC® as Accession Number PTA-250; a fragment of at
least 15 amino acid residues of the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17,
20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid
sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the percent identity
is determined using the ALIGN program of the GCG software package with a PAM120
weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an amino acid
sequence which is encoded by a nucleic acid molecule which hybridizes to the nucleic acid
molecule consisting of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24,
25, 27, 28 or 30, under conditions of hybridization of 6X SSC at 45°C and washing in 0.2X
SSC, 0.1% SDS at 65°C. In various embodiments, the substantially purified antibodies of
the invention, or fragments thereof, can be human, non-human, chimeric and/or humanized
antibodies.

Any of the antibodies of the invention can be conjugated to a therapeutic moiety or
to a detectable substance. Non-limiting examples of detectable substances that can be
conjugated to the antibodies of the invention are an enzyme, a prosthetic group, a
fluorescent material, a luminescent material, a bioluminescent material, and a radioactive
material.

The invention also provides a kit containing an antibody of the invention conjugated 
to a detectable substance, and instructions for use. Still another aspect of the invention is a 
pharmaceutical composition comprising an antibody of the invention and a 
pharmaceutically acceptable carrier. In preferred embodiments, the pharmaceutical 
composition contains an antibody of the invention, a therapeutic moiety, and a 
pharmaceutically acceptable carrier. 

Other features and advantages of the invention will be apparent from the following 
detailed description and claims.

Brief Description of the Drawings

Figures 1A-1B depict the cDNA sequence of human INTERCEPT 340 (SEQ ID
NO:1) and the predicted amino acid sequence of INTERCEPT 340 (SEQ ID NO:2). The
open reading frame of SEQ ID NO:1 extends from nucleotide 1222 to nucleotide 1944 of
SEQ ID NO:1 (SEQ ID NO:3).
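
The open reading frame coordinates above are 1-based and inclusive. The following Python sketch, offered only as an illustration with hypothetical function names, shows how such coordinates map onto a zero-based string slice and how the length of the reading frame follows from them.

```python
def orf_slice(cdna: str, start: int, end: int) -> str:
    """Return the subsequence for 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(cdna)):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are zero-based and end-exclusive, hence start - 1.
    return cdna[start - 1:end]


def orf_summary(start: int, end: int) -> dict:
    """Length of the reading frame in nucleotides and whole codons."""
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return {"nucleotides": length, "codons": codons, "in_frame": remainder == 0}


# Coordinates recited for the INTERCEPT 340 open reading frame.
summary = orf_summary(1222, 1944)
print(summary)
```

For nucleotides 1222 to 1944 this gives 723 nucleotides, or 241 whole codons. Whether the last codon is a stop codon is not stated in this passage, so the sketch makes no claim about the length of the encoded polypeptide.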

Figure 2 depicts a hydropathy plot of human INTERCEPT 340. Relatively
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation
sites (Ngly) are indicated by short vertical lines just below the hydropathy
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of
INTERCEPT 340 are indicated. The amino acid sequence of each of the fibrillar collagen
C-terminal domains is indicated by underlining and the abbreviation "COLF".
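
A hydropathy trace of the kind described above can be sketched as a sliding-window average of per-residue hydropathy values. The Python sketch below is an assumption-laden illustration: this passage does not name the scale or window used for the figures, so the Kyte-Doolittle scale and a window of 9 residues are illustrative choices, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Mean hydropathy of each full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Window means above zero would plot above the dashed line (relatively hydrophobic) and means below zero would plot below it (relatively hydrophilic), under the assumed scale.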

Figure 3 depicts an alignment of each of the fibrillar collagen C-terminal domains
(also referred to herein as "COLF domains") of human INTERCEPT 340 with consensus
hidden Markov model COLF domains. For each alignment, the upper sequence is the
consensus amino acid sequence (SEQ ID NOs:31, 32, and 33), while the lower
amino acid sequence corresponds to amino acid 58 to amino acid 116 of SEQ ID NO:2
(SEQ ID NO:34), amino acid 126 to amino acid 151 of SEQ ID NO:2 (SEQ ID NO:35), and
amino acid 186 to amino acid 217 of SEQ ID NO:2 (SEQ ID NO:36).

Figures 4A-4C depict the cDNA sequence of human MANGO 003 (SEQ ID NO:4) 
and the predicted amino acid sequence of MANGO 003 (SEQ ID NO:5). The open reading 
frame of SEQ ID NO:4 extends from nucleotide 57 to nucleotide 1568 of SEQ ID NO:4 
(SEQ ID NO:6).

Figure 5 depicts a hydropathy plot of human MANGO 003. Relatively hydrophobic 
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of MANGO 003 
are indicated. The amino acid sequences of the immunoglobulin domains and of the 
neurotransmitter gated ion channel domain are indicated by underlining and the 
abbreviations "ig" and "neur chan", respectively.

Figure 6 depicts an alignment of each of the immunoglobulin domains (also referred 
to herein as "Ig domains") of human MANGO 003 with the consensus hidden Markov 
model immunoglobulin domains. For each alignment, the upper sequence is the consensus 
sequence (SEQ ID NO:37), while the lower sequence corresponds to amino acid 44 to 
amino acid 101 of SEQ ID NO:5 (SEQ ID NO:38), amino acid 165 to amino acid 223 of 
SEQ ID NO:5 (SEQ ID NO:39), and amino acid 261 to amino acid 340 of SEQ ID NO:5 
(SEQ ID NO:40).

Figure 7 depicts an alignment of the neurotransmitter gated ion channel domain of 
human MANGO 003 with the consensus hidden Markov model neurotransmitter gated ion 
channel domain. The upper sequence is the consensus sequence (SEQ ID NO:42), while the 
lower sequence corresponds to amino acid 388 to amino acid 397 of SEQ ID NO:5 (SEQ ID 
NO:43).

Figure 8 depicts the cDNA sequence of mouse MANGO 003 (SEQ ID NO:7) and 
the predicted amino acid sequence of MANGO 003 (SEQ ID NO:8). The open reading 
frame of SEQ ID NO:7 extends from nucleotide 1 to nucleotide 626 of SEQ ID NO:4 (SEQ 
ID NO:9).

Figure 9 depicts a hydropathy plot of mouse MANGO 003. Relatively hydrophobic 
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of mouse MANGO 
003 are indicated. 

Figure 10 depicts the cDNA sequence of human MANGO 347 (SEQ ID NO:10) and 
the predicted amino acid sequence of MANGO 347 (SEQ ID NO:11). The open reading 
frame of SEQ ID NO:10 extends from nucleotide 31 to nucleotide 444 of SEQ ID NO:10 
(SEQ ID NO:12).

Figure 11 depicts a hydropathy plot of human MANGO 347. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by 
short vertical lines just below the hydropathy trace. Below the hydropathy plot, the 
numbers corresponding to the amino acid sequence of MANGO 347 are indicated. The 
amino acid sequence of the CUB domain is indicated by underlining and the abbreviation 
"CUB". 

Figure 12 depicts an alignment of the CUB domain of human MANGO 347 with a 
consensus hidden Markov model CUB domain. The upper sequence is the consensus amino 
acid sequence (SEQ ID NO:44), while the lower sequence corresponds to amino acid 40 to 
amino acid 136 of SEQ ID NO:11 (SEQ ID NO:45).

Figures 13A-13[?] depict the cDNA sequence of human TANGO 272 (SEQ ID 
NO:13) and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:14). The open 
reading frame of SEQ ID NO:13 extends from nucleotide 230 to nucleotide 3379 of SEQ ID 
NO:13 (SEQ ID NO:15).

Figure 14 depicts a hydropathy plot of human TANGO 272. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential 
N-glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
TANGO 272 are indicated. The amino acid sequence of each of the fourteen EGF-like 
domains and the delta serrate ligand domain is indicated by underlining and the 
abbreviations "EGF-like" and "DSL", respectively.

Figures 15A-15C depict an alignment of each of the EGF-like domains of human 
TANGO 272 with consensus hidden Markov model EGF-like domains. The upper 
sequence is the consensus amino acid sequence (SEQ ID NO:46), while the lower sequence 
corresponds to amino acid 151 to amino acid 181 of SEQ ID NO:14 (SEQ ID NO:49); 
amino acid 200 to amino acid 229 of SEQ ID NO:14 (SEQ ID NO:50); amino acid 242 to 
amino acid 272 of SEQ ID NO:14 (SEQ ID NO:51); amino acid 285 to amino acid 315 of 
SEQ ID NO:14 (SEQ ID NO:52); amino acid 328 to amino acid 358 of SEQ ID NO:14 
(SEQ ID NO:53); amino acid 378 to amino acid 404 of SEQ ID NO:14 (SEQ ID NO:54); 
amino acid 417 to amino acid 447 of SEQ ID NO:14 (SEQ ID NO:55); amino acid 460 to 
amino acid 490 of SEQ ID NO:14 (SEQ ID NO:56); amino acid 503 to amino acid 533 of 
SEQ ID NO:14 (SEQ ID NO:57); amino acid 546 to amino acid 576 of SEQ ID NO:14 
(SEQ ID NO:58); amino acid 589 to amino acid 619 of SEQ ID NO:14 (SEQ ID NO:59); 
amino acid 632 to amino acid 661 of SEQ ID NO:14 (SEQ ID NO:60); amino acid 674 to 
amino acid 704 of SEQ ID NO:14 (SEQ ID NO:61); and amino acid 717 to amino acid 747 of 
SEQ ID NO:14 (SEQ ID NO:62). For alignment of the delta serrate ligand domain, the 
upper sequence is the consensus hidden Markov model (SEQ ID NO:47), while the lower 
sequence corresponds to amino acid 518 to amino acid 576 of SEQ ID NO:14 (SEQ ID 
NO:63).

Figures 16A-16B depict the cDNA sequence of mouse TANGO 272 (SEQ ID 
NO:16) and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:17). The open 
reading frame of SEQ ID NO:16 extends from nucleotide 1 to nucleotide 1492 of SEQ ID 
NO:16 (SEQ ID NO:18).

Figure 17 depicts a hydropathy plot of mouse TANGO 272. Relatively hydrophobic 
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of mouse TANGO 
272 are indicated. 

Figure 18 depicts the cDNA sequence of human TANGO 295 (SEQ ID NO:22) and 
the predicted amino acid sequence of TANGO 295 (SEQ ID NO:23). The open reading 
frame of SEQ ID NO:22 extends from nucleotide 217 to nucleotide 684 of SEQ ID NO:28 
(SEQ ID NO:24).

Figure 19 depicts a hydropathy plot of human TANGO 295. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential 
N-glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
human TANGO 295 are indicated. The amino acid sequence of the pancreatic ribonuclease 
domain is indicated by underlining and the abbreviation "RNase A". 

Figure 20 depicts an alignment of the pancreatic ribonuclease domain of human 
TANGO 295 with a consensus hidden Markov model pancreatic ribonuclease domain. The 
upper sequence is the consensus amino acid sequence (SEQ ID NO:96), while the lower 
sequence corresponds to amino acid 32 to amino acid 156 of SEQ ID NO:23 (SEQ ID 
NO:97). 

Figures 21A-21B depict the cDNA sequence of human TANGO 354 (SEQ ID 
NO:25) and the predicted amino acid sequence of TANGO 354 (SEQ ID NO:26). The open 
reading frame of SEQ ID NO:25 extends from nucleotide 62 to nucleotide 976 of SEQ ID 
NO:25 (SEQ ID NO:27).

Figure 22 depicts a hydropathy plot of human TANGO 354. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N- 
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
human TANGO 354 are indicated. The amino acid sequence of the immunoglobulin 
domain is indicated by underlining and the abbreviation "ig".

Figure 23 depicts an alignment of the immunoglobulin domain of human TANGO 
354 with a consensus hidden Markov model immunoglobulin domain. The upper sequence 
is the consensus amino acid sequence (SEQ ID NO:37), while the lower sequence 
corresponds to amino acid 33 to amino acid 110 of SEQ ID NO:26 (SEQ ID NO:41).

Figures 24A-24C depict the cDNA sequence of human TANGO 378 (SEQ ID 
NO:28) and the predicted amino acid sequence of TANGO 378 (SEQ ID NO:29). The open 
reading frame of SEQ ID NO:28 extends from nucleotide 42 to nucleotide 1625 of SEQ ID 
NO:28 (SEQ ID NO:30).

Figure 25 depicts a hydropathy plot of human TANGO 378. Relatively 
hydrophobic residues are above the dashed horizontal line, and relatively hydrophilic 
residues are below the dashed horizontal line. The cysteine residues (cys) and potential N- 
glycosylation sites (Ngly) are indicated by short vertical lines just below the hydropathy 
trace. Below the hydropathy plot, the numbers corresponding to the amino acid sequence of 
human TANGO 378 are indicated. The amino acid sequence of the seven transmembrane 
domain is indicated by underlining and the abbreviation "7tm". 

Figure 26 depicts an alignment of the seven transmembrane receptor domain of 
human TANGO 378 with a consensus hidden Markov model of this domain. The upper 
sequence is the consensus amino acid sequence (SEQ ID NO:98), while the lower sequence 
corresponds to amino acid 187 to amino acid 515 of SEQ ID NO:29 (SEQ ID NO:99). 

Figures 27A-27C depict a global alignment between the nucleotide sequence of the 
open reading frame (ORF) of human MANGO 003 (SEQ ID NO:6) and the nucleotide 
sequence of the open reading frame of mouse MANGO 003 (SEQ ID NO:9). The upper 
sequence is the human MANGO 003 ORF nucleotide sequence, while the lower sequence is 
the mouse MANGO 003 ORF nucleotide sequence. These nucleotide sequences share 
31.1% identity. The global alignment was performed using the ALIGN program version 
2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score 
of -1212; Myers and Miller, 1989, CABIOS 4:11-7).
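
The percent identity values quoted for these alignments can be related to a gapped pairwise alignment as follows. This is a minimal sketch, assuming identity is counted as the number of identical, non-gap columns divided by the total number of alignment columns; the ALIGN program's own convention may differ, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both rows carry the same residue.

    Both arguments are rows of one pairwise alignment, so they must have the
    same length; `gap` is the character used to pad insertions and deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (assumption): six columns, four of them identical.
toy_identity = percent_identity("ACGT-A", "ACCTTA")
```

Dividing by the number of aligned (non-gap) columns instead of all columns is an equally common convention and would give a higher value for gapped alignments.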

Figures 28A-28B depict a local alignment between the nucleotide sequence of 
human MANGO 003 (SEQ ID NO:4) and the nucleotide sequence of mouse MANGO 003 
(SEQ ID NO:7). The upper sequence is the human MANGO 003 nucleotide sequence, 
while the lower sequence is the mouse MANGO 003 nucleotide sequence. These 
nucleotide sequences share 62.8% identity over nucleotide 970 to nucleotide 2080 of the 
human MANGO 003 sequence (nucleotide 10 to nucleotide 1070 of mouse MANGO 003). 
The local alignment was performed using the L-ALIGN program version 2.0u54 July 1996 
(Matrix file used: pam120.mat, gap penalties of -12/-4 with a score of 3241; Huang and 
Miller, 1991, Adv. Appl. Math. 12:373-381).

Figure 29 depicts a global alignment between the amino acid sequence of human 
MANGO 003 (SEQ ID NO:5) and the amino acid sequence of mouse MANGO 003 (SEQ 
ID NO:8). The upper sequence is the human MANGO 003 amino acid sequence, while the 
lower sequence is the mouse MANGO 003 amino acid sequence. These amino acid 
sequences share 30.1% identity. The global alignment was performed using the ALIGN 
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global 
alignment score of -488; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 30A-30E depict a global alignment between the nucleotide sequence of the 
open reading frame (ORF) of human TANGO 272 (SEQ ID NO:15) and the nucleotide 
sequence of the open reading frame of mouse TANGO 272 (SEQ ID NO:18). The upper 
sequence is the mouse TANGO 272 ORF nucleotide sequence, while the lower sequence is 
the human TANGO 272 ORF nucleotide sequence. These nucleotide sequences share 
39.1% identity. The global alignment was performed using the ALIGN program version 
2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score 
of -79; Myers and Miller, 1989, CABIOS 4:11-7). 

Figures 31A-31D depict a local alignment between the nucleotide sequence of 
human TANGO 272 (SEQ ID NO:13) and the nucleotide sequence of mouse TANGO 272 
(SEQ ID NO:16). The upper sequence is the human TANGO 272 nucleotide sequence, 
while the lower sequence is the mouse TANGO 272 nucleotide sequence. These 
nucleotide sequences share 67.6% identity over nucleotide 1890 to nucleotide 4610 of 
the human TANGO 272 sequence (nucleotide 10 to nucleotide 2560 of mouse TANGO 
272). The local alignment was performed using the L-ALIGN program version 2.0u54 July 
1996 (Matrix file used: pam120.mat, gap penalties of -12/-4 with a score of 8462; Huang 
and Miller, 1991, Adv. Appl. Math. 12:373-381).

Figures 32A-32B depict a global alignment between the amino acid sequence of 
human TANGO 272 (SEQ ID NO:14) and the amino acid sequence of mouse TANGO 272 
(SEQ ID NO:17). The upper sequence is the human TANGO 272 amino acid sequence, 
while the lower sequence is the mouse TANGO 272 amino acid sequence. These amino 
acid sequences share 38.2% identity. The global alignment was performed using the 
ALIGN program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a 
global alignment score of -19; Myers and Miller, 1989, CABIOS 4:11-7). 

Figures 33A-33D depict the cDNA sequence of rat TANGO 272 (SEQ ID NO:19) 
and the predicted amino acid sequence of TANGO 272 (SEQ ID NO:20). The open reading 
frame of SEQ ID NO:19 extends from nucleotide 925 to nucleotide 2832 of SEQ ID NO:19 
(SEQ ID NO:21).

Figures 34A-34H depict a global alignment between the nucleotide sequence of 
human TANGO 272 (SEQ ID NO:13) and the nucleotide sequence of rat TANGO 272 
(SEQ ID NO:19). The upper sequence is the human TANGO 272 nucleotide sequence, 
while the lower sequence is the rat TANGO 272 nucleotide sequence. These nucleotide 
sequences share 55.7% identity. The global alignment was performed using the ALIGN 
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global 
alignment score of 8635; Myers and Miller, 1989, CABIOS 4:11-7).

Figures 35A-35F depict a global alignment between the nucleotide sequence of 
mouse TANGO 272 (SEQ ID NO:16) and the nucleotide sequence of rat TANGO 272 
(SEQ ID NO:19). The upper sequence is the mouse TANGO 272 nucleotide sequence, 
while the lower sequence is the rat TANGO 272 nucleotide sequence. These nucleotide 
sequences share 43.7% identity. The global alignment was performed using the ALIGN 
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global 
alignment score of 2827; Myers and Miller, 1989, CABIOS 4:11-7).

Figure 36 depicts a global alignment of the human TANGO 295 and GenPept 
AF037081 amino acid sequences. The upper sequence is the human TANGO 295 sequence 
(SEQ ID NO:23), while the lower sequence is the GenPept AF037081 sequence (SEQ ID 
NO:100). GenPept AF037081 encodes a ribonuclease k6 protein. The global alignment 
revealed a 53.2% identity between these two sequences (Matrix file used: pam120.mat, gap 
penalties of -12/-4 with a global alignment score of 405; Myers and Miller, 1989, CABIOS 
4:11-7). 

Figures 37A-37C depict a global alignment of the human TANGO 295 (SEQ ID 
NO:22) and GenPept AF037081 (SEQ ID NO:100) nucleotide sequences. The upper 
sequence is the human TANGO 295 sequence, while the lower sequence is the GenPept 
AF037081 sequence. The global alignment revealed a 22.6% identity between these two 
sequences (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment 
score of -2718; Myers and Miller, 1989, CABIOS 4:11-7). 

Figures 38A-38B depict a local alignment of the human TANGO 295 (SEQ ID 
NO:22) and GenPept AF037081 (SEQ ID NO:100) nucleotide sequences. The upper 
sequence is the human TANGO 295 sequence, while the lower sequence is the GenPept 
AF037081 sequence. The local alignment revealed a 62.7% identity between nucleotide 
235 to nucleotide 687 of human TANGO 295, and nucleotide 3 to nucleotide 453 of 
AF037081; 43.4% identity between nucleotide 410 to nucleotide 850 of human TANGO 
295, and nucleotide 3 to nucleotide 450 of AF037081; and 46.5% identity between 
nucleotide 432 to nucleotide 700 of human TANGO 295, and nucleotide 5 to nucleotide 251 
of AF037081 (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global 
alignment score of 1214; Huang and Miller, 1991, Adv. Appl. Math. 12:373-381).

Figures 39A-39B depict an alignment of each of the EGF-like domains and 
laminin-EGF-like domains of mouse TANGO 272 with consensus hidden Markov model 
EGF-like domains. For alignments of the EGF-like domains, the upper sequence is the consensus 
amino acid sequence (SEQ ID NO:46), while the lower sequence corresponds to amino 
acids 37-67 of SEQ ID NO:17 (SEQ ID NO:64); amino acid 80 to amino acid 110 of SEQ 
ID NO:17 (SEQ ID NO:65); amino acid 123 to amino acid 153 of SEQ ID NO:17 (SEQ ID 
NO:66); and amino acid 166 to amino acid 196 of SEQ ID NO:17 (SEQ ID NO:67). For 
alignments of the laminin/EGF-like domains, the upper sequence is the consensus hidden 
Markov model domain (SEQ ID NO:48), while the lower sequence corresponds to amino 
acid 3 to amino acid 37 of SEQ ID NO:17 (SEQ ID NO:68); amino acid 41 to amino acid 
80 of SEQ ID NO:17 (SEQ ID NO:69); amino acid 83 to amino acid 123 of SEQ ID NO:17 
(SEQ ID NO:70); and amino acid 127 to amino acid 172 of SEQ ID NO:17 (SEQ ID 
NO:71). For alignment of the delta serrate ligand domain, the upper sequence is the 
consensus hidden Markov model domain (SEQ ID NO:47), while the lower sequence 
corresponds to amino acid 10 to amino acid 67 of SEQ ID NO:17 (SEQ ID NO:72).

Figure 40 depicts a hydropathy plot of rat TANGO 272. Relatively hydrophobic 
residues are above the dashed horizontal line, and relatively hydrophilic residues are below 
the dashed horizontal line. The cysteine residues (cys) and potential N-glycosylation sites 
(Ngly) are indicated by short vertical lines just below the hydropathy trace. Below the 
hydropathy plot, the numbers corresponding to the amino acid sequence of rat TANGO 272 
are indicated. 

Figures 41A-41D depict an alignment of each of the EGF-like domains and 
laminin-EGF-like domains of rat TANGO 272 with consensus hidden Markov model 
EGF-like domains. For alignments of the EGF-like domains, the upper sequence is the consensus 
amino acid sequence (SEQ ID NO:46), while the lower sequence corresponds to amino acid 
18 to amino acid 48 of SEQ ID NO:20 (SEQ ID NO:73); amino acid 61 to amino acid 91 of 
SEQ ID NO:20 (SEQ ID NO:74); amino acids 105-137 of SEQ ID NO:20 (SEQ ID 
NO:75); amino acids 150-180 of SEQ ID NO:20 (SEQ ID NO:76); amino acids 193-223 of 
SEQ ID NO:20 (SEQ ID NO:77); amino acids 236-266 of SEQ ID NO:20 (SEQ ID 
NO:78); amino acids 279-309 of SEQ ID NO:20 (SEQ ID NO:79); amino acids 322-352 of 
SEQ ID NO:20 (SEQ ID NO:80); amino acids 365-394 of SEQ ID NO:20 (SEQ ID 
NO:81); amino acids 407-437 of SEQ ID NO:20 (SEQ ID NO:82); and amino acids 
450-480 of SEQ ID NO:20 (SEQ ID NO:83). For alignments of the laminin/EGF-like domains, 
the upper sequence is the consensus hidden Markov model domain (SEQ ID NO:48), while 
the lower sequence corresponds to amino acids 22-61 of SEQ ID NO:20 (SEQ ID NO:84); 
amino acids 65-105 of SEQ ID NO:20 (SEQ ID NO:85); amino acids 109-150 of SEQ ID 
NO:20 (SEQ ID NO:86); amino acids 154-193 of SEQ ID NO:20 (SEQ ID NO:87); amino 
acids 197-236 of SEQ ID NO:20 (SEQ ID NO:88); amino acids 240-279 of SEQ ID NO:20 
(SEQ ID NO:89); amino acids 283-322 of SEQ ID NO:20 (SEQ ID NO:90); amino acids 
326-365 of SEQ ID NO:20 (SEQ ID NO:91); amino acids 368-407 of SEQ ID NO:20 (SEQ 
ID NO:92); amino acids 411-450 of SEQ ID NO:20 (SEQ ID NO:93); and amino acids 
454-489 of SEQ ID NO:20 (SEQ ID NO:94). For alignment of the delta serrate ligand domain, 
the upper sequence is the consensus hidden Markov model domain (SEQ ID NO:47), while 
the lower sequence corresponds to amino acids 246-309 of SEQ ID NO:20 (SEQ ID 
NO:95).



Detailed Description of the Invention 

The present invention is based, at least in part, on the discovery of cDNA molecules 
encoding INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, 
TANGO 354, and TANGO 378, all of which are either wholly secreted or transmembrane 
proteins. 

The proteins and nucleic acid molecules of the present invention comprise a family 
of molecules having certain conserved structural and functional features. As used herein, 
the term "family" is intended to mean two or more proteins or nucleic acid molecules 
having a common structural domain and having sufficient amino acid or nucleotide 
sequence identity as defined herein. Family members can be from either the same or 
different species. For example, a family can comprise two or more proteins of human 
origin, or can comprise one or more proteins of human origin and one or more of non- 
human origin. Members of the same family may also have common structural domains. 

For example, INTERCEPT 340 family members can include at least one, preferably 
two, and more preferably three fibrillar collagen C-terminal domains (also referred to herein 
as "COLF domains"). As used herein, a "fibrillar collagen C-terminal domain" refers to an 
amino acid sequence of about 15 to 65, preferably about 20-60, more preferably about 25, 
31-58 amino acids in length. Consensus hidden Markov model COLF domains contain the 
sequence of SEQ ID NOs:31, 32, and 33 (Figure 3). The more conserved residues in the 
consensus sequence are indicated by uppercase letters and the less conserved residues in the 
consensus sequence are indicated by lowercase letters. A comparison of the C-terminal 
sequences of fibrillar collagens, collagens X, VIII, and the collagen C1q revealed a 
conserved cluster of amino acid residues having aromatic side chains (e.g., tyrosine, 
phenylalanine, tryptophan, histidine) that exhibited marked similarities in hydrophilicity 
profiles between the different collagens, despite a low level of sequence similarity. These 
similarities in hydrophilicity profiles within their C-termini suggest that these proteins may 
adopt a common tertiary structure and that the conserved cluster of aromatic residues in this 
domain may be involved in C-terminal trimerization. The COLF domains of INTERCEPT 
340 extend from about amino acids 58 to 116, 126 to 151, and 186 to 217 of SEQ ID NO:2 
(SEQ ID NOs:34, 35, and 36, respectively) (Figure 3). By alignment of the amino acid 
sequence of the consensus hidden Markov model COLF amino acid sequence with the 
amino acid sequence of the COLF domains of INTERCEPT 340, conserved amino acid 
residues having aromatic side chains can be found. For example, conserved tyrosine, 
tryptophan and phenylalanine residues can be found at amino acid 87, 88 and 133 of SEQ 
ID NO:2.

MANGO 003 and TANGO 354 family members can include at least one, preferably 
two, and more preferably three immunoglobulin domains. As used herein, an 
"immunoglobulin domain" (also referred to herein as "Ig") refers to an amino acid sequence 
of about 45 to 85, preferably about 55-80, more preferably about 57, 58, or 78, 79 amino 
acids in length. Preferably, the immunoglobulin domains have a bit score for the alignment 
of the sequence to the Ig family Hidden Markov Model (HMM) of at least 10, preferably 
20-30, more preferably 22-40, more preferably 40-50, 50-75, 75-100, 100-200 or greater. 
The Ig family HMM has been assigned the PFAM Accession PF00047. Consensus hidden 
Markov model immunoglobulin domains are shown in Figures 6 and 23 (SEQ ID NO:37). 
The more conserved residues in this consensus sequence are indicated by uppercase letters 
and the less conserved residues in the consensus sequence are indicated by lowercase 
letters. Immunoglobulin domains are present in a variety of proteins (including secreted 
and membrane-associated proteins). Membrane-associated proteins may be involved in 
protein-protein and protein-ligand interactions at the cell surface, and thus may influence 
diverse activities including cell surface recognition and/or signal transduction. The 
immunoglobulin domains of MANGO 003 extend from about amino acids 44 to 101, 165 to 
223, and 261 to 340 of SEQ ID NO:5 (SEQ ID NOs:38, 39, and 40, respectively) (Figure 
6). The immunoglobulin domain of TANGO 354 extends from about amino acids 33 to 110 
of SEQ ID NO:26 (SEQ ID NO:41) (Figure 23).

MANGO 003 family members can include a neurotransmitter-gated ion channel 
domain. As used herein, a "neurotransmitter-gated ion channel domain" refers to an amino 
acid sequence of about 5 to 20, preferably about 7 to 12, more preferably about 9 to 10 
amino acids in length. The neurotransmitter-gated ion channel domain HMM has been 
assigned the PFAM Accession PF00065. A consensus hidden Markov model 
neurotransmitter-gated ion channel domain contains the sequence of SEQ ID NO:42 shown 
in Figure 7. The more conserved residues in the consensus sequence are indicated by 
uppercase letters and the less conserved residues in the consensus sequence are indicated by 
lowercase letters. The neurotransmitter-gated ion channel domain of MANGO 003 extends 
from about amino acids 388 to 397 of SEQ ID NO:5 (SEQ ID NO:43).

TANGO 272 family members can include at least one, two, three, four, five, six, 
seven, eight, nine, ten, eleven, twelve, preferably thirteen, and more preferably fourteen 
EGF-like domains. Preferably, the EGF-like domains are found in the extracellular domain 
of a TANGO 272 protein. As used herein, an "EGF-like domain" refers to an amino acid 
sequence of about 25 to 50, preferably about 30 to 45, and more preferably 30 to 40 amino 
acid residues in length. An EGF domain further contains at least about 2 to 10, preferably, 
3 to 9, 4 to 8, or 6 to 7 conserved cysteine residues. A consensus hidden Markov model 
EGF-like domain sequence includes six cysteines, all of which are thought to be involved in 
disulfide bonds, and has the following amino acid sequence: Cys-Xaa(5, 7)-Cys-Xaa(4, 5, 
12)-Cys-Xaa(1, 5, 6)-Cys-Xaa(1)-Cys-Xaa(1)-Cys-Xaa(8)-Cys (SEQ ID NO:46), where 
Xaa is any amino acid. The region between the fifth and the sixth cysteine typically 
contains two conserved glycines of which at least one is present in most EGF-like domains. 

In one embodiment, TANGO 272 includes at least one EGF-like domain having a 
sequence selected from the group consisting of: amino acids 151-181 of SEQ ID NO:14 
(SEQ ID NO:49); amino acids 200-229 of SEQ ID NO:14 (SEQ ID NO:50); amino acids 
242-272 of SEQ ID NO:14 (SEQ ID NO:51); amino acids 285-315 of SEQ ID NO:14 (SEQ 
ID NO:52); amino acids 328-358 of SEQ ID NO:14 (SEQ ID NO:53); amino acids 378-404 
of SEQ ID NO:14 (SEQ ID NO:54); amino acids 417-447 of SEQ ID NO:14 (SEQ ID 
NO:55); amino acids 460-490 of SEQ ID NO:14 (SEQ ID NO:56); amino acids 503-533 of 
SEQ ID NO:14 (SEQ ID NO:57); amino acids 546-576 of SEQ ID NO:14 (SEQ ID 
NO:58); amino acids 589-619 of SEQ ID NO:14 (SEQ ID NO:59); amino acids 632-661 of 
SEQ ID NO:14 (SEQ ID NO:60); amino acids 674-704 of SEQ ID NO:14 (SEQ ID 
NO:61); and amino acids 717-747 of SEQ ID NO:14 (SEQ ID NO:62).

In another embodiment, TANGO 272 includes at least one EGF-like domain having 
a sequence selected from the group consisting of: amino acids 37-67 of SEQ ID NO:17 (SEQ ID 
NO:64); amino acids 80-110 of SEQ ID NO:17 (SEQ ID NO:65); amino acids 123-153 of 
SEQ ID NO:17 (SEQ ID NO:66); and amino acids 166-196 of SEQ ID NO:17 (SEQ ID 
NO:67).

In yet another embodiment, TANGO 272 includes at least one EGF-like domain 
having a sequence selected from the group consisting of: amino acids 18-48 of SEQ ID 
NO:20 (SEQ ID NO:73); amino acids 61-91 of SEQ ID NO:20 (SEQ ID NO:74); amino 
acids 105-137 of SEQ ID NO:20 (SEQ ID NO:75); amino acids 150-180 of SEQ ID NO:20 
(SEQ ID NO:76); amino acids 193-223 of SEQ ID NO:20 (SEQ ID NO:77); amino acids 
236-266 of SEQ ID NO:20 (SEQ ID NO:78); amino acids 279-309 of SEQ ID NO:20 (SEQ 
ID NO:79); amino acids 322-352 of SEQ ID NO:20 (SEQ ID NO:80); amino acids 365-394 
of SEQ ID NO:20 (SEQ ID NO:81); amino acids 407-437 of SEQ ID NO:20 (SEQ ID 
NO:82); and amino acids 450-480 of SEQ ID NO:20 (SEQ ID NO:83).

An alignment of the consensus hidden Markov model EGF-like domains with the 
EGF-like domains of human TANGO 272 is shown in Figures 15A-15C. The more 
conserved residues in the consensus sequence are indicated by uppercase letters and the less 
conserved residues in the consensus sequence are indicated by lowercase letters. By 
alignment of the amino acid sequence of the consensus hidden Markov model EGF-like 
domain with the amino acid sequence of the EGF-like domains of TANGO 272, conserved 
cysteine residues can be found. For example, conserved cysteine residues can be found at 
amino acid 151, 159, 164, 167, 200, 206, 211, 218, 220, 229, 242, 249, 263, 264, 272, 285, 
291, 297, 304, 306, 315, 328, 334, 340, 347, 349, 358, 378, 386, 393, 395, 404, 417, 423, 
429, 436, 438, 447, 460, 466, 472, 479, 481, 490, 503, 509, 515, 522, 524, 533, 546, 552, 
558, 565, 567, 576, 589, 595, 601, 608, 610, 619, 632, 637, 643, 650, 652, 661, 674, 680, 
686, 693, 695, 717, 723, 729, 736, 738 and 747 of SEQ ID NO:14.

TANGO 272 family members can include at least one delta serrate ligand domain. 
As used herein, a "delta serrate ligand domain" (also referred to herein as a "DSL domain") 
refers to an amino acid sequence of about 30-70, more preferably 45-60, and most 
preferably 58 amino acids in length typically found in transmembrane signaling molecules 
that regulate differentiation in metazoans (Lissemore et al., 1999, Mol. Phylogenet. Evol. 
11(2):308-19). In one embodiment, human TANGO 272 includes a delta serrate ligand 
domain from about amino acids 518 to 576 of SEQ ID NO:14 (SEQ ID NO:63); and about 
amino acids 246 to 309 of SEQ ID NO:20 (SEQ ID NO:95). Figure 15B depicts an 
alignment of the consensus hidden Markov model delta serrate ligand domain (SEQ ID 
NO:47) with this domain in human TANGO 272 at amino acids 518 to 576 of SEQ ID 
NO:14 (SEQ ID NO:63). Figures 39A-39B depict an alignment of the consensus hidden 
Markov model delta serrate ligand domain (SEQ ID NO:47) with this domain in mouse 
TANGO 272 at amino acids 10 to 67 of SEQ ID NO:17 (SEQ ID NO:72). Figures 41A-41B 
depict an alignment of the consensus hidden Markov model delta serrate ligand domain 
(SEQ ID NO:47) with this domain in rat TANGO 272 at amino acids 246 to 309 of SEQ ID 
NO:20 (SEQ ID NO:95).

TANGO 272 family members can include at least one RGD cell attachment site. As 
used herein, the term "RGD cell attachment site" refers to a cell adhesion sequence 
consisting of amino acids Arg-Gly-Asp typically found in extracellular matrix proteins such 
as collagens, laminin and fibronectin, among others (reviewed in Ruoslahti, 1996, Annu.
Rev. Cell Dev. Biol. 12:697-715). Preferably, the RGD cell attachment site is located in the
extracellular domain of a TANGO 272 protein and interacts with (e.g., binds to) a cell surface
receptor, such as an integrin receptor. As used herein, the term "integrin" refers to a family
of receptors comprising α/β heterodimers that mediate cell attachment to extracellular
matrices and cell-cell adhesion events. The α subunits vary in size between 120 and 180
kDa and are each noncovalently associated with a β subunit (90-110 kDa) (reviewed by
Hynes, 1992, Cell 69:11-25). Most integrins are expressed in a wide variety of cells, and
most cells express several integrins. There are at least 8 known α subunits and 14 known β
subunits. The majority of the integrin ligands are extracellular matrix proteins involved in
substratum cell adhesion, such as collagens, laminin, and fibronectin, among others. The RGD
cell attachment site is located at about amino acid residues 177-179 of SEQ ID NO: 14. 
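
By way of non-limiting illustration, the location of an RGD cell attachment site within an amino acid sequence can be found with a simple motif scan. The following Python sketch is not part of the disclosed methods; the function name and the toy sequence are hypothetical, and positions are reported as 1-based residue numbers in the manner used throughout this description.

```python
def find_rgd_sites(sequence: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of every Arg-Gly-Asp tripeptide."""
    seq = sequence.upper()
    sites = []
    start = seq.find("RGD")
    while start != -1:
        # Convert the 0-based string index to 1-based residue numbering.
        sites.append((start + 1, start + 3))
        start = seq.find("RGD", start + 1)
    return sites


if __name__ == "__main__":
    toy_sequence = "MKTLRGDAAGRGDW"  # hypothetical sequence, not SEQ ID NO:14
    print(find_rgd_sites(toy_sequence))  # [(5, 7), (11, 13)]
```

Applied to the full sequence of SEQ ID NO:14, such a scan would be expected to report the site described above at about residues 177-179.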

MANGO 347 family members can include a CUB domain sequence. As used 
herein, the term "CUB domain" includes an amino acid sequence of at least about 80-150,
preferably 90-130, more preferably 96-120, and most preferably about 110 amino acids
in length. Preferably, a CUB domain further includes at least one, preferably two, three,
and most preferably four conserved cysteine residues. Preferably, the conserved cysteine
residues form at least one, and preferably two disulfide bridges (e.g., Cys1-Cys2, and
Cys3-Cys4) resulting in a β-barrel configuration. The CUB domain of MANGO 347 extends
from about amino acid 40 to amino acid 136 of SEQ ID NO:11 (SEQ ID NO:45). Figure 12
depicts an alignment of the consensus hidden Markov model CUB domain (SEQ ID NO:44)
with this domain in human MANGO 347 at amino acids 40 to 136 of SEQ ID NO:11 (SEQ
ID NO:45).

TANGO 295 family members can include a pancreatic ribonuclease domain
sequence. As used herein, the term "pancreatic ribonuclease domain" includes an amino
acid sequence of at least about 100 to 150, preferably 110-140, more preferably 120-130,
and most preferably 124 amino acids in length. Preferably, a pancreatic ribonuclease
domain further includes at least one, preferably two, three, four and most preferably five
conserved cysteine residues and an amino acid residue, e.g., a lysine, which is involved in
catalytic activity. Preferably, at least one cysteine residue is involved in a disulfide bond, a
lysine residue is involved in catalytic activity, and three other residues are involved in substrate
binding. Proteins having the pancreatic ribonuclease domain are pyrimidine-specific
endonucleases present in high quantities in the pancreas of a number of mammalian taxa
and of a few reptiles. The pancreatic ribonuclease domain of TANGO 295 extends from
about amino acid 32 to amino acid 156 of SEQ ID NO:23 (SEQ ID NO:97). Figure 20
depicts an alignment of the consensus hidden Markov model pancreatic ribonuclease
domain (SEQ ID NO:96) with this domain in human TANGO 295 at amino acids 32 to 156
of SEQ ID NO:23 (SEQ ID NO:97).

Based on structural similarities, TANGO 378 family members can be classified as 
members of the superfamily of G protein-coupled receptors. As used herein, the term "G
protein-coupled receptor" or "GPCR" refers to a family of proteins that preferably comprise
an N-terminal extracellular domain, seven transmembrane domains (also referred to as
membrane-spanning domains), three extracellular domains (also referred to as extracellular
loops), three cytoplasmic domains (also referred to as cytoplasmic loops), and a C-terminal
cytoplasmic domain (also referred to as a cytoplasmic tail). Members of the GPCR family
also share certain conserved amino acid residues, some of which have been determined to
be critical to receptor function and/or G protein signaling. An alignment of the
transmembrane domains of 44 representative GPCRs can be found at 
http://mgdlckLnidlLnih.gov: 8000/extended.html. 

Accordingly, in one embodiment, TANGO 378 family members can include at least 
one, two, three, four, five, six, or preferably, seven transmembrane domains, and thus have a
"7 transmembrane receptor profile". As used herein, the term "7 transmembrane receptor
profile" includes an amino acid sequence having at least about 10-300, preferably about
15-200, more preferably about 20-100 amino acid residues, or at least about 22-100 amino
acids in length and having a bit score for the alignment of the sequence to the 7tm_1 family
Hidden Markov Model (HMM) of at least 10, preferably 20-30, more preferably 22-40,
more preferably 40-50, 50-75, 75-100, 100-200 or greater. The 7tm_1 family HMM has
been assigned the PFAM Accession PF00001
(http://genome.wustl.edu/Pfam/WWWdata/7tm_1.html). In one embodiment, the seven
transmembrane domains of TANGO 378 extend from about amino acid 245 to about
amino acid 269 of SEQ ID NO:29 (SEQ ID NO:135), about amino acid 287 to about
amino acid 306 of SEQ ID NO:29 (SEQ ID NO:136), about amino acid 323 to about
amino acid 343 of SEQ ID NO:29 (SEQ ID NO:137), about amino acid 358 to about
amino acid 376 of SEQ ID NO:29 (SEQ ID NO:138), about amino acid 414 to about
amino acid 438 of SEQ ID NO:29 (SEQ ID NO:139), about amino acid 457 to about
amino acid 477 of SEQ ID NO:29 (SEQ ID NO:140), and about amino acid 485 to about
amino acid 504 of SEQ ID NO:29 (SEQ ID NO:141); and a C-terminal cytoplasmic domain
which extends from about amino acid 505 to amino acid 528 of SEQ ID NO:29 (SEQ ID
NO:142). Figure 26 depicts an alignment of each of the transmembrane domains of
TANGO 378 with the consensus hidden Markov model seven transmembrane receptor
domain (SEQ ID NO:98).

To identify the presence of a 7 transmembrane receptor profile in a TANGO 378, the 
amino acid sequence of the protein is searched against a database of HMMs (e.g., the Pfam 
database, release 2.1) using the default parameters
(http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program,
which is available as part of the HMMER package of search programs, is a family specific
default program for PF00001 and a score of 15 is the default threshold score for determining
a hit. Alternatively, the seven transmembrane domains can be predicted based on stretches
of hydrophobic amino acids forming α-helices (SOSUI server). Accordingly, proteins
having at least 50-60% identity, preferably about 60-70%, more preferably about 70-80%,
or about 80-90% identity with the 7 transmembrane receptor profile of human TANGO 378
are within the scope of the invention. 

TANGO 378 family members can include at least one, preferably two, and most 
preferably three extracellular loops. As defined herein, the term "loop" includes an amino
acid sequence having a length of at least about 4, preferably about 5-10, preferably about 
10-20, and more preferably about 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100, 
or 100-150 amino acid residues, and has an amino acid sequence that connects two 
transmembrane domains within a protein or polypeptide. Accordingly, the N-terminal 
amino acid of a loop is adjacent to a C-terminal amino acid of a transmembrane domain in a 
naturally-occurring TANGO 378 or TANGO 378-like molecule, and the C-terminal amino
acid of a loop is adjacent to an N-terminal amino acid of a transmembrane domain in a
naturally-occurring TANGO 378 or TANGO 378-like molecule. As used herein, an
"extracellular loop" includes an amino acid sequence located outside of a cell, or 
extracellularly. For example, an extracellular loop can be found at about amino acids 307- 
322, 377-413, and 478-484 of SEQ ID NO:29. 

TANGO 378 family members can include at least one, preferably two, and most 
preferably three cytoplasmic loops. As used herein, a "cytoplasmic loop" includes an amino
acid sequence located within a cell or within the cytoplasm of a cell. For example, a
cytoplasmic loop is found at about amino acids 270-286, 344-357, and 439-456 of SEQ ID 
NO:29. 
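
By way of non-limiting illustration, the loop boundaries recited above follow directly from the transmembrane domain boundaries, because each loop spans the residues between two consecutive transmembrane domains. The Python sketch below, which is not part of the disclosed methods, derives them from the seven transmembrane segments of SEQ ID NO:29 recited herein. The function name is hypothetical, and the assumption that the first loop is cytoplasmic reflects the extracellular N-terminal domain described for TANGO 378.

```python
# Transmembrane segments of TANGO 378 (1-based, inclusive) as recited for SEQ ID NO:29.
TM_SEGMENTS = [(245, 269), (287, 306), (323, 343), (358, 376),
               (414, 438), (457, 477), (485, 504)]


def loops_between(tm_segments, first_loop_side="cytoplasmic"):
    """Return (start, end, side) for each loop connecting consecutive TM segments.

    Sides alternate because each membrane crossing moves the chain to the
    opposite face of the membrane.
    """
    sides = ["cytoplasmic", "extracellular"]
    idx = sides.index(first_loop_side)
    loops = []
    for (_, prev_end), (next_start, _) in zip(tm_segments, tm_segments[1:]):
        loops.append((prev_end + 1, next_start - 1, sides[idx]))
        idx = 1 - idx
    return loops


if __name__ == "__main__":
    for start, end, side in loops_between(TM_SEGMENTS):
        print(f"{side} loop: {start}-{end}")
```

Under these assumptions the sketch yields cytoplasmic loops at 270-286, 344-357, and 439-456 and extracellular loops at 307-322, 377-413, and 478-484, in agreement with the positions given above.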

In one embodiment, a MANGO 003, a TANGO 272, a TANGO 354 or a TANGO
378 family member can include one or more of the following domains: (1) an N-terminal 
extracellular domain, (2) a transmembrane domain, or (3) a C-terminal cytoplasmic domain. 

A MANGO 003, a TANGO 272, a TANGO 354 or a TANGO 378 family member can
include an extracellular domain. When located at the N-terminus, the extracellular
domain is referred to herein as an "N-terminal extracellular domain" or an "extracellular
domain". As used herein, an "N-terminal extracellular domain" includes an amino acid
sequence having about 1-800, preferably about 1-746, more preferably about 1-650, more
preferably about 1-550, more preferably about 1-369, or about 150 amino acid residues in
length and is located outside of a cell or extracellularly. The C-terminal amino acid residue
of an "N-terminal extracellular domain" is adjacent to an N-terminal amino acid residue of a
transmembrane domain in a naturally-occurring MANGO 003, TANGO 272, TANGO 354 
or TANGO 378 protein. Preferably, the N-terminal extracellular domain is capable of 
interacting with (e.g., binding to) an extracellular signal, for example, a ligand (e.g., a
glycoprotein hormone) or a cell surface receptor (e.g., an integrin receptor). Most
preferably, the N-terminal extracellular domain mediates a variety of biological processes, 
for example, protein-protein interactions, signal transduction and/or cell adhesion. In one 
embodiment, an N-terminal extracellular domain is located at about amino acids 25-374 of
SEQ ID NO:5 (SEQ ID NO:103); at about amino acids 1-73 of SEQ ID NO:8 (SEQ ID
NO:107); at about amino acids 21-767 of SEQ ID NO:14 (SEQ ID NO:114); at about amino
acids 1-216 of SEQ ID NO:17 (SEQ ID NO:118); at about amino acids 1-500 of SEQ ID
NO:20 (SEQ ID NO:122); at about amino acids 20-169 of SEQ ID NO:26 (SEQ ID
NO:129); and at about amino acids 22-244 of SEQ ID NO:29 (SEQ ID NO:134).

In another embodiment, a MANGO 003, a TANGO 272, a TANGO 354 or a 
TANGO 378 family member can include a transmembrane domain. As used herein, the 
term "transmembrane domain" includes an amino acid sequence of about 15 amino acid
residues in length which spans the plasma membrane. More preferably, a transmembrane
domain includes at least about 20, 25, 30, 35, 40, or 45 amino acid residues and spans the
plasma membrane. Transmembrane domains are rich in hydrophobic residues, and 
typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%,
80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic,
e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are
described in, for example, http://pfam.wustl.edu/cgi-bin/getdesc?name=7tm-1 and Zagotta
et al., 1996, Annu. Rev. Neurosci. 19:235-63, the contents of which are incorporated
herein by reference. Amino acid residues 375-398 of SEQ ID NO:5 (SEQ ID NO:104),
74-96 of SEQ ID NO:8 (SEQ ID NO:108), 768-791 of SEQ ID NO:14 (SEQ ID NO:115),
217-240 of SEQ ID NO:17 (SEQ ID NO:119), 501-524 of SEQ ID NO:20 (SEQ ID NO:123),
170-193 of SEQ ID NO:26 (SEQ ID NO:130), and 245-269, 287-306, 323-343, 358-376,
414-438, 457-477 and 485-504 of SEQ ID NO:29 (SEQ ID NOs:135-141) include
transmembrane domains.
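
By way of non-limiting illustration, the hydrophobic content criterion described above can be evaluated computationally. The Python sketch below is not part of the disclosed methods. The set of residues treated as hydrophobic is an assumption (it borrows the residues this description lists for signal peptides), and the function names, the default length of 15 residues, and the default 50% threshold are illustrative choices drawn from the ranges recited above.

```python
# Assumed hydrophobic residue set: Ala, Ile, Leu, Phe, Pro, Tyr, Trp, Val.
HYDROPHOBIC = frozenset("AILFPYWV")


def hydrophobic_fraction(segment: str) -> float:
    """Fraction of residues in the segment that belong to HYDROPHOBIC."""
    seg = segment.upper()
    if not seg:
        raise ValueError("segment must not be empty")
    return sum(residue in HYDROPHOBIC for residue in seg) / len(seg)


def meets_tm_criteria(segment: str, min_length: int = 15,
                      min_fraction: float = 0.5) -> bool:
    """True if the segment is long enough and sufficiently hydrophobic."""
    return (len(segment) >= min_length
            and hydrophobic_fraction(segment) >= min_fraction)


if __name__ == "__main__":
    candidate = "LIVLLAVFLLIVWALLIA"  # hypothetical 18-residue segment
    print(hydrophobic_fraction(candidate), meets_tm_criteria(candidate))
```

A composition test of this kind is only a screen; it does not by itself establish that a segment spans the membrane.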

A MANGO 003, TANGO 272, TANGO 354 or TANGO 378 family member can 
include a C-terminal cytoplasmic domain. As used herein, a "C-terminal cytoplasmic
domain" includes an amino acid sequence having a length of at least about 10, preferably 
about 10-25, more preferably about 25-50, more preferably about 50-75, even more 
preferably about 75-100, 100-133, 133-150, 150-200, 200-250, 250-300, 300-400, 400-500, 
or 500-600 amino acid residues and is located within a cell or within the cytoplasm of a
cell. Accordingly, the N-terminal amino acid residue of a "C-terminal cytoplasmic domain"
is adjacent to a C-terminal amino acid residue of a transmembrane domain in a
naturally-occurring MANGO 003, TANGO 272, TANGO 354 or TANGO 378 protein. For example,
a C-terminal cytoplasmic domain is found at about amino acid residues 399-504 of SEQ ID
NO:5, 97-208 of SEQ ID NO:8, 792-1050 of SEQ ID NO:14, 241-497 of SEQ ID NO:17,
525-636 of SEQ ID NO:20, 194-305 of SEQ ID NO:26, and 505-528 of SEQ ID NO:29.

MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, or TANGO 
378 family members can include a signal peptide. As used herein, a "signal peptide"
includes a peptide of at least about 15 amino acid residues in length which occurs at the
N-terminus of secretory and membrane-bound proteins and which contains at least about 70%
hydrophobic amino acid residues such as alanine, leucine, isoleucine, phenylalanine,
proline, tyrosine, tryptophan, or valine. The sequence can contain about 15 to 45 amino
acid residues or about 17-22 amino acid residues, and has at least about 60-80%, 65-75%, or
about 70% hydrophobic residues. A signal peptide serves to direct a protein containing
such a sequence to a lipid bilayer. Thus, in one embodiment, a MANGO 003 protein
contains a signal peptide of about amino acids 1-22, 1-23, 1-24, 1-25, or 1-26 of SEQ ID
NO:5 (SEQ ID NO:101). In one embodiment, a MANGO 347 protein contains a signal
peptide of about amino acids 1-33, 1-34, 1-35, 1-36, or 1-37 of SEQ ID NO:11 (SEQ ID
NO:110). In one embodiment, a TANGO 272 protein contains a signal peptide of amino
acids 1-18, 1-19, 1-20, 1-21, or 1-22 of SEQ ID NO:14 (SEQ ID NO:112). In yet another
embodiment, a TANGO 295 protein contains a signal peptide of amino acids 1-26, 1-27,
1-28, 1-29, or 1-30 of SEQ ID NO:23 (SEQ ID NO:125). In another embodiment, a TANGO
354 protein contains a signal peptide of amino acids 1-17, 1-18, 1-19, 1-20, or 1-21 of SEQ
ID NO:26 (SEQ ID NO:127). In another embodiment, a TANGO 378 protein contains a
signal peptide of amino acids 1-19, 1-20, 1-21, 1-22, or 1-23 of SEQ ID NO:29 (SEQ ID
NO:132). The signal peptide is cleaved during processing of the mature protein. The
amino acid sequence of the mature MANGO 003, MANGO 347, TANGO 272, TANGO
295, TANGO 354, or TANGO 378 protein starts at the next amino acid after the signal
peptide is cleaved. For example, the amino acid sequence of MANGO 003 may start at
amino acids 23, 24, 25, 26, or 27 depending on the exact location of the cleavage of the
signal peptide.
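
By way of non-limiting illustration, the relationship between the end of the signal peptide and the first residue of the mature protein can be expressed as follows. The Python sketch is not part of the disclosed methods; the function names and the toy precursor sequence are hypothetical, and residue numbers are 1-based.

```python
def mature_start_positions(signal_peptide_ends):
    """Map each candidate signal peptide end to the first mature residue."""
    return [end + 1 for end in signal_peptide_ends]


def mature_sequence(precursor: str, signal_peptide_end: int) -> str:
    """Return the precursor with residues 1..signal_peptide_end removed."""
    if not 0 <= signal_peptide_end <= len(precursor):
        raise ValueError("signal peptide end lies outside the precursor")
    return precursor[signal_peptide_end:]


if __name__ == "__main__":
    # Candidate MANGO 003 signal peptides end at residues 22 through 26.
    print(mature_start_positions(range(22, 27)))  # [23, 24, 25, 26, 27]
    print(mature_sequence("MAAALLLVQDK", 7))      # hypothetical precursor -> "VQDK"
```

The first call reproduces the alternative starting positions 23, 24, 25, 26, or 27 given above for mature MANGO 003.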

The signal peptide is cleaved during processing of the mature protein. Sometimes
the initial methionine residue is also cleaved from the protein during signal peptide
processing. Thus, in one embodiment, a MANGO 003 protein does not contain a signal
peptide or an initial methionine residue and begins from residue 2 of SEQ ID NO:102. In
one embodiment, a MANGO 347 protein does not contain a signal peptide or an initial
methionine residue and begins from residue 2 of SEQ ID NO:111. In one embodiment, a
TANGO 272 protein does not contain a signal peptide or an initial methionine residue and
begins from residue 2 of SEQ ID NO:113. In one embodiment, a TANGO 295
protein does not contain a signal peptide or an initial methionine residue and begins from
residue 2 of SEQ ID NO:126. In one embodiment, a TANGO 354 protein does not
contain a signal peptide or an initial methionine residue and begins from residue 2 of SEQ ID
NO:128. In one embodiment, a TANGO 378 protein does not contain a signal
peptide or an initial methionine residue and begins from residue 2 of SEQ ID NO:133.

In one embodiment, a MANGO 003 family member includes three immunoglobulin
domains and a neurotransmitter-gated ion channel domain. In another embodiment, a
MANGO 003 family member includes three immunoglobulin domains, a
neurotransmitter-gated ion channel domain and a transmembrane domain. In yet another embodiment, a
MANGO 003 family member includes three immunoglobulin domains, a
neurotransmitter-gated ion channel domain, a transmembrane domain and an N-terminal extracellular
domain. In another embodiment, a MANGO 003 family member includes three
immunoglobulin domains, a neurotransmitter-gated ion channel domain, a transmembrane
domain, an N-terminal extracellular domain and a C-terminal cytoplasmic domain. In yet
another embodiment, a MANGO 003 family member includes three immunoglobulin
domains, a neurotransmitter-gated ion channel domain, a transmembrane domain, an
N-terminal extracellular domain, a C-terminal cytoplasmic domain, and a signal peptide.

In one embodiment, a TANGO 354 family member includes at least one
immunoglobulin domain and a transmembrane domain. In another embodiment, a
TANGO 354 family member includes at least one immunoglobulin domain, a
transmembrane domain and a signal peptide. 

In one embodiment, a TANGO 272 family member includes fourteen EGF-like
domains and a delta serrate ligand domain. In another embodiment, a TANGO 272 family
member includes fourteen EGF-like domains, a delta serrate ligand domain and an RGD
cell attachment site. In yet another embodiment, a TANGO 272 family member includes
fourteen EGF-like domains, a delta serrate ligand domain, an RGD cell attachment site, and
a transmembrane domain. In another embodiment, a TANGO 272 family member includes
fourteen EGF-like domains, a delta serrate ligand domain, an RGD cell attachment site, a
transmembrane domain, and an extracellular N-terminal domain. In another embodiment, a
TANGO 272 family member includes fourteen EGF-like domains, a delta serrate ligand
domain, an RGD cell attachment site, a transmembrane domain, an extracellular N-terminal
domain and a C-terminal cytoplasmic domain. In another embodiment, a TANGO 272
family member includes fourteen EGF-like domains, a delta serrate ligand domain, an RGD
cell attachment site, a transmembrane domain, an extracellular N-terminal domain, a
C-terminal cytoplasmic domain, and a signal peptide.

In one embodiment, a TANGO 378 family member includes a 7 transmembrane 
receptor profile and three extracellular loops. In another embodiment, a TANGO 378 
family member includes a 7 transmembrane receptor profile, three extracellular loops, and 
three cytoplasmic loops. In yet another embodiment, a TANGO 378 family member 
includes a 7 transmembrane receptor profile, three extracellular loops, three cytoplasmic
loops, and an extracellular N-terminal domain. In another embodiment, a TANGO 378 
family member includes a 7 transmembrane receptor profile, three extracellular loops, three 
cytoplasmic loops, an extracellular N-terminal domain, and a C-terminal cytoplasmic 
domain. In another embodiment, a TANGO 378 family member includes a 7 
transmembrane receptor profile, three extracellular loops, three cytoplasmic loops, an
extracellular N-terminal domain, a C-terminal cytoplasmic domain, and a signal peptide. 

Various features of INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, 
TANGO 295, TANGO 354, and TANGO 378 are summarized below. 

INTERCEPT 340

A cDNA encoding INTERCEPT 340 was identified by analyzing the sequences of 
clones present in a human fetal spleen cDNA library. 

This analysis led to the identification of a clone, jthsa102b12, encoding full-length
human INTERCEPT 340. The cDNA of this clone is 3284 nucleotides long (Figures 1A-1B;
SEQ ID NO:1). The 723 nucleotide open reading frame of this cDNA, nucleotides
1222-1944 of SEQ ID NO:1 (SEQ ID NO:3), encodes a 241 amino acid protein (Figures
1A-1B; SEQ ID NO:2).
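
By way of non-limiting illustration, the stated open reading frame length and protein length can be checked against the recited nucleotide coordinates. The Python sketch below is not part of the disclosed methods; the function name is hypothetical, and it assumes that the recited coordinates are 1-based, inclusive, and cover only the coding codons.

```python
def orf_summary(first_nt: int, last_nt: int) -> tuple[int, int, int]:
    """Return (length in nucleotides, number of complete codons, leftover nucleotides)."""
    if last_nt < first_nt:
        raise ValueError("last_nt must not precede first_nt")
    length = last_nt - first_nt + 1
    codons, leftover = divmod(length, 3)
    return length, codons, leftover


if __name__ == "__main__":
    # INTERCEPT 340 open reading frame: nucleotides 1222-1944 of SEQ ID NO:1.
    print(orf_summary(1222, 1944))  # (723, 241, 0)
```

Under the stated assumption, the coordinates 1222-1944 correspond to 723 nucleotides and 241 complete codons, consistent with the 241 amino acid protein described above.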

Human INTERCEPT 340 that has not been post-translationally modified is
predicted to have a molecular weight of 27.2 kDa.

Human INTERCEPT 340 includes three fibrillar collagen C-terminal (COLF)
domains at amino acids 58-116 of SEQ ID NO:2 (SEQ ID NO:34); amino acids 126-151 of
SEQ ID NO:2 (SEQ ID NO:35); and amino acids 186-217 of SEQ ID NO:2 (SEQ ID
NO:36). Figure 3 depicts alignments of each of the COLF domains of human INTERCEPT
340 with consensus hidden Markov model COLF domains (SEQ ID NOs:31, 32, and 33).
In one embodiment, INTERCEPT 340 is a secreted protein. In another embodiment,
INTERCEPT 340 is a membrane-associated protein.

An N-glycosylation site is present at amino acids 105-108 of SEQ ID NO:2. A
glycosaminoglycan attachment site is present at amino acids 161-164 of SEQ ID NO:2.
Protein kinase C phosphorylation sites are present at amino acids 57-59, 152-154, and
227-229 of SEQ ID NO:2. A tyrosine kinase phosphorylation site is present at amino acids
81-87 of SEQ ID NO:2. Casein kinase II phosphorylation sites are present at amino acids
36-39, 120-123 and 181-184 of SEQ ID NO:2. N-myristylation sites are present at amino
acids 109-114 and 164-169 of SEQ ID NO:2.

Clone jthsa102b12, which encodes human INTERCEPT 340, was deposited as a
composite deposit having a designation BpI340 with the American Type Culture Collection 
(ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 and 
assigned Accession Number PTA-250. A description of the deposit conditions is set forth 
in the section entitled "Deposit of Clones" below. This deposit will be maintained under
the terms of the Budapest Treaty on the International Recognition of the Deposit of 
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a 
convenience for those of skill in the art and is not an admission that a deposit is required 
under 35 U.S.C. §112.

Figure 2 depicts a hydropathy plot of human INTERCEPT 340. Relatively
hydrophobic regions are above the horizontal line, and relatively hydrophilic regions are
below the horizontal line. The cysteine residues (cys) are indicated by short vertical lines
just below the hydropathy trace.
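
By way of non-limiting illustration, a hydropathy trace of the kind depicted in the figures can be computed as a sliding-window average of per-residue hydropathy values. The Python sketch below is not part of the disclosed methods. The use of the Kyte-Doolittle scale, the window of nine residues, and the function names are assumptions made for illustration; this description does not state which scale or window was used to prepare the figures.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window of `window` residues, left to right."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def cysteine_positions(sequence: str) -> list[int]:
    """1-based positions of cysteine residues, as marked beneath the trace."""
    return [i + 1 for i, residue in enumerate(sequence.upper()) if residue == "C"]
```

Windows whose mean lies above zero correspond to the relatively hydrophobic regions plotted above the horizontal line, and windows below zero correspond to the relatively hydrophilic regions.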

Use of INTERCEPT 340 Nucleic Acids, Polypeptides, and Modulators Thereof

INTERCEPT 340 includes three fibrillar collagen C-terminal domains. Proteins
having such domains play a role in modulating connective tissue formation and/or
maintenance, and thus can influence a wide variety of biological processes, including
assembly into fibrils; strengthening and organization of the extracellular matrix; shaping of
tissues and cells; modulation of cell migration; and/or modulation of signal transduction
pathways. Because INTERCEPT 340 includes fibrillar collagen C-terminal domains,
INTERCEPT 340 polypeptides, nucleic acids, and modulators thereof can be used to treat
connective tissue disorders, including a skin disorder and/or a skeletal disorder (e.g., Marfan
syndrome and osteogenesis imperfecta); cardiovascular disorders, including
hyperproliferative vascular diseases (e.g., hypertension, vascular restenosis and
atherosclerosis), ischemia reperfusion injury, cardiac hypertrophy, coronary artery disease,
myocardial infarction, arrhythmia, cardiomyopathies, and congestive heart failure; and/or
hematopoietic disorders (e.g., myeloid disorders, lymphoid malignancies, T cell disorders).

As INTERCEPT 340 was originally found in a fetal spleen library, INTERCEPT
340 nucleic acids, proteins, and modulators thereof can be used to modulate the function,
survival, morphology, migration, proliferation and/or differentiation of cells that form the
spleen, e.g., cells of the splenic connective tissue, e.g., splenic smooth muscle cells and/or
endothelial cells of the splenic blood vessels. INTERCEPT 340 nucleic acids, proteins, and
modulators thereof can also be used to modulate the proliferation, differentiation, and/or 
function of cells that are processed, e.g., regenerated or phagocytized within the spleen, e.g., 
erythrocytes and/or B and T lymphocytes and macrophages. Thus INTERCEPT 340 
nucleic acids, proteins, and modulators thereof can be used to treat spleen, e.g., the fetal 
spleen, associated diseases and disorders. Examples of splenic diseases and disorders 
include e.g., splenic lymphoma and/or splenomegaly, and/or phagocytotic disorders, e.g., 
those inhibiting macrophage engulfment of bacteria and viruses in the bloodstream.

Further, in light of INTERCEPT 340's presence in a human fetal spleen cDNA
library, INTERCEPT 340 expression can be utilized as a marker for specific tissues (e.g.,
lymphoid tissues such as the spleen) and/or cells (e.g., splenic) in which INTERCEPT 340
is expressed. INTERCEPT 340 nucleic acids can also be utilized for chromosomal
mapping.

MANGO 003

A cDNA encoding human MANGO 003 was identified by analyzing the sequences 
of clones present in a human thyroid cDNA library. 

This analysis led to the identification of a clone, jthYa030d03, encoding full-length 
human MANGO 003. The cDNA of this clone is 3169 nucleotides long (Figures 4A-4B;
SEQ ID NO:4). The 1512 nucleotide open reading frame of this cDNA, nucleotide 57 to
nucleotide 1568 of SEQ ID NO:4 (SEQ ID NO:6), encodes a 504 amino acid protein
(Figures 4A-4B; SEQ ID NO:5).

Human MANGO 003 that has not been post-translationally modified is predicted to
have a molecular weight of 54.5 kDa prior to cleavage of its signal peptide (52.1 kDa after
cleavage of its signal peptide).

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human MANGO 003 includes a 24 amino acid signal
peptide at amino acid 1 to about amino acid 24 of SEQ ID NO:5 (SEQ ID NO:101)
preceding the mature human MANGO 003 protein which corresponds to about amino acid
25 to amino acid 504 of SEQ ID NO:5 (SEQ ID NO:102).

Human MANGO 003 is a transmembrane protein having an extracellular domain
which extends from about amino acid 25 to about amino acid 374 of SEQ ID NO:5 (SEQ
ID NO:103), a transmembrane domain which extends from about amino acid 375 to about
amino acid 398 of SEQ ID NO:5 (SEQ ID NO:104), and a cytoplasmic domain which
extends from about amino acid 399 to amino acid 504 of SEQ ID NO:5 (SEQ ID NO:105).

Alternatively, in another embodiment, a human MANGO 003 protein contains an 
extracellular domain which extends from about amino acid 399 to amino acid 504 of SEQ
ID NO:5 (SEQ ID NO:105), a transmembrane domain which extends from about amino
acid 375 to about amino acid 398 of SEQ ID NO:5 (SEQ ID NO:104), and a cytoplasmic
domain which extends from about amino acid 25 to about amino acid 374 of SEQ ID NO:5
(SEQ ID NO:103).

Human MANGO 003 includes three immunoglobulin domains at amino acids 44- 
101 of SEQ ID NO:5 (SEQ ID NO:38); amino acids 165-223 of SEQ ID NO:5 (SEQ ID
NO:39); and amino acids 261-340 of SEQ ID NO:5 (SEQ ID NO:40). Figure 6 depicts
alignments of each of the immunoglobulin domains of MANGO 003 with a consensus
hidden Markov model immunoglobulin domain (SEQ ID NO:37).

Human MANGO 003 includes a neurotransmitter gated ion channel domain at 
amino acids 388-397 of SEQ ID NO:5 (SEQ ID NO:43). Figure 7 depicts an alignment of
the neurotransmitter gated ion channel domain of human MANGO 003 with a
neurotransmitter gated ion channel domain derived from a hidden Markov model (SEQ ID
NO:42). 

N-glycosylation sites are present at amino acids 111-114, 231-234, 255-258, and
293-296 of SEQ ID NO:5. A cAMP and cGMP-dependent protein kinase phosphorylation
site is present at amino acids 202-205 of SEQ ID NO:5. Protein kinase C phosphorylation
sites are present at amino acids 44-48, 167-169, 207-209, 216-218, 220-222, 224-226,
233-235, 347-349, and 422-424 of SEQ ID NO:5. Casein kinase II phosphorylation sites are
present at amino acids 192-195, 256-259, 294-297, 313-316, 422-425, and 490-493 of SEQ
ID NO:5. Tyrosine kinase phosphorylation sites are present at amino acids 212-219 and
329-336 of SEQ ID NO:5. N-myristylation sites are present at amino acids 95-100,
228-233, 261-266, 317-322, 334-339, 382-387, and 443-448 of SEQ ID NO:5.

Clone jthYa030d03, which encodes human MANGO 003, was deposited as a 
composite deposit having a designation EpthLa6al with the American Type Culture
Collection (ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on March 27,
1999 and assigned Accession Number 207178. This deposit will be maintained under the
terms of the Budapest Treaty on the International Recognition of the Deposit of
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a
convenience for those of skill in the art and is not an admission that a deposit is required
under 35 U.S.C. §112.

Figure 5 depicts a hydropathy plot of human MANGO 003. Relatively hydrophobic
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 5 indicates the presence of a 
hydrophobic domain within human MANGO 003, suggesting that human MANGO 003 is a 
transmembrane protein.

A cDNA encoding mouse MANGO 003 was identified by analyzing the sequences 
of clones present in a mouse choroid plexus cDNA library. 

This analysis led to the identification of a clone, jflnji004cl 1, encoding partial
mouse MANGO 003. The cDNA of this clone is 504 nucleotides long (Figures 8A-8B;
SEQ ID NO:7). The 626 nucleotide open reading frame of this cDNA, nucleotides 1-626 of
SEQ ID NO:7 (SEQ ID NO:9), encodes a 208 amino acid protein (Figures 8A-8B; SEQ ID
NO:8). 

Northern blot analysis using the mouse clone jfinjfD04cl 1 revealed strong 
expression of the mouse MANGO 003 gene in the mouse liver, skeletal muscle and kidney. 
Moderate expression was detected in the heart, lung and testis, and lower levels of 
expression were detected in the mouse brain. No expression was detected in the spleen. 

Mouse MANGO 003 that has not been post-translationally modified is predicted to 
have a molecular weight of 22.3 kDa. 

Mouse MANGO 003 is a transmembrane protein having an extracellular domain 
which extends from about amino acid 1 to about amino acid 73 of SEQ ID NO:8 (SEQ ID
NO:107), a transmembrane domain which extends from about amino acid 74 to about
amino acid 96 of SEQ ID NO:8 (SEQ ID NO:108), and a cytoplasmic domain which
extends from about amino acid 97 to amino acid 208 of SEQ ID NO:8 (SEQ ID NO:109).

An N-glycosylation site is present at amino acids 190-193 of SEQ ID NO:8. Protein
kinase C phosphorylation sites are present at amino acids 44-46, 98-100, 119-121, and
197-199 of SEQ ID NO:8. Casein kinase II phosphorylation sites are present at amino acids
10-13, and 119-122 of SEQ ID NO:8. A tyrosine kinase phosphorylation site is present at
amino acids 26-33 of SEQ ID NO:8. N-myristylation sites are present at amino acids
14-19, 31-36, and 79-84 of SEQ ID NO:8.

Figure 9 depicts a hydropathy plot of mouse MANGO 003. Relatively hydrophobic
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below
the hydropathy trace. The hydropathy plot of Figure 9 indicates the presence of a
hydrophobic domain within mouse MANGO 003, suggesting that mouse MANGO 003 is a
transmembrane protein.

A global alignment between the nucleotide sequence of the open reading frame
(ORF) of human MANGO 003 (SEQ ID NO:6) and the nucleotide sequence of the open
reading frame of mouse MANGO 003 (SEQ ID NO:9) revealed a 31.1% identity (Figures
27A-27C). The global alignment was performed using the ALIGN program version 2.0u
(Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score of
-1212; Myers and Miller, 1989, CABIOS 4:11-7).
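
By way of illustration only, the effect of the gap penalties on a global alignment score can be sketched as follows. The sketch is not the ALIGN program: it substitutes a simple match/mismatch score for the PAM120 matrix, and it assumes that penalties of -12/-4 mean a charge of 12 for the first residue of a gap and 4 for each additional residue. The function name and the default scores are hypothetical.

```python
NEG_INF = float("-inf")


def global_affine_score(a, b, match=1, mismatch=-1, gap_first=12, gap_next=4):
    """Optimal global alignment score with affine gap penalties.

    Assumption: a gap of k residues costs gap_first + (k - 1) * gap_next.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap
    # Y: b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_first + (i - 1) * gap_next)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_first + (j - 1) * gap_next)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap (cost gap_first) or extend an existing one (cost gap_next).
            X[i][j] = max(M[i - 1][j] - gap_first,
                          Y[i - 1][j] - gap_first,
                          X[i - 1][j] - gap_next)
            Y[i][j] = max(M[i][j - 1] - gap_first,
                          X[i][j - 1] - gap_first,
                          Y[i][j - 1] - gap_next)
    return max(M[n][m], X[n][m], Y[n][m])


# Hypothetical toy sequences, not taken from any SEQ ID NO.
print(global_affine_score("ACGT", "ACGT"))  # four matches, no gaps
print(global_affine_score("ACGT", "AGT"))   # three matches, one single-residue gap
```

Under these assumptions a negative total score, such as those reported for the global alignments, simply means that gap and mismatch charges outweigh the credit for matched positions.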

A local alignment between the nucleotide sequence of human MANGO 003 (SEQ
ID NO:4) and the nucleotide sequence of mouse MANGO 003 (SEQ ID NO:7) revealed a
62.8% identity over nucleotides 970-2080 of the human MANGO 003 sequence
(nucleotides 10-1070 of mouse MANGO 003) (Figures 28A-28B). The local alignment was
performed using the L-ALIGN program version 2.0u54 July 1996 (Matrix file used:
pam120.mat, gap penalties of -12/-4 with a score of 3241; Huang and Miller, 1991, Adv. Appl.
Math. 12:373-81).

A global alignment between the amino acid sequence of human MANGO 003 (SEQ
ID NO:5) and the amino acid sequence of mouse MANGO 003 (SEQ ID NO:8) revealed a
30.1% identity (Figure 29). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of -88; Myers and Miller, 1989, CABIOS 4:11-7).
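
As an illustrative sketch, a percent identity figure of the kind reported above can be computed from a pair of aligned rows as shown below. The denominator used here (total alignment columns, including gap columns) is an assumption; the ALIGN program may normalize differently. The function name and the aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns that hold the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical aligned fragments, not taken from Figure 29.
row_a = "MKT-LLVA"
row_b = "MRTALL-A"
print(round(percent_identity(row_a, row_b), 1))  # 5 identical columns out of 8
```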



Use of MANGO 003 Nucleic Acids, Polypeptides, and Modulators Thereof

MANGO 003 includes three immunoglobulin-like domains. Proteins having such
domains play a role in mediating protein-protein and protein-ligand interactions, and thus
can influence a wide variety of biological processes, including cell surface recognition;
transduction of an extracellular signal (e.g., by interacting with a ligand and/or a
cell-surface receptor); and/or modulation of signal transduction pathways.

MANGO 003 further includes a neurotransmitter-gated ion channel domain.
Proteins having such domains play a role in modulating signal transmission at chemical
synapses by, for example, influencing processes, such as the release of neurotransmitters
from a cell (e.g., a neuronal cell); modulating membrane excitability and/or resting
potential; and/or modulating ion flux across a membrane of a cell (e.g., a neuronal or a
muscle cell). Because MANGO 003 includes a neurotransmitter-gated ion channel domain,
MANGO 003 polypeptides, nucleic acids, and modulators thereof can be used to treat
neural disorders (e.g., a CNS disorder, including Alzheimer's disease, Pick's disease,
Parkinson's and other Lewy diffuse body diseases, multiple sclerosis, amyotrophic lateral
sclerosis, progressive supranuclear palsy, epilepsy, and Jakob-Creutzfieldt disease;
psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis,
mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia
or age-related memory loss; and neurological disorders, e.g., migraine).

MANGO 003 polypeptides, nucleic acids, and modulators thereof can be used to
modulate function, survival, morphology, migration, proliferation and/or differentiation of
cells in the tissues in which it is expressed (e.g., thyroid, liver, skeletal muscle, kidney,
heart, lung, testis and brain). For example, MANGO 003 polypeptides, nucleic acids, and
modulators thereof can be used to modulate endocrine, hepatic, skeletal muscular, renal,
cardiac, reproductive and/or brain function. Accordingly, these molecules can be used to
treat a variety of diseases including, but not limited to, endocrine disorders (e.g.,
hypothyroidism, hyperthyroidism, dwarfism, giantism, acromegaly); hepatic disorders (e.g.,
hepatitis, liver cirrhosis, hepatoma, liver cysts, and hepatic vein thrombosis); skeletal
muscular disorders; renal disorders (e.g., renal cell carcinoma, nephritis, polycystic kidney
disease); cardiovascular disorders (e.g., atherosclerosis, ischemia reperfusion injury, cardiac
hypertrophy, hypertension, coronary artery disease, myocardial infarction, arrhythmia,
cardiomyopathies, and congestive heart failure); and/or reproductive disorders (e.g.,
sterility).

MANGO 003 polypeptides, nucleic acids, or modulators thereof, can be used to treat
hepatic (liver) disorders, such as jaundice, hepatic failure, hereditary hyperbilirubinemias
(e.g., Gilbert's syndrome, Crigler-Najjar syndromes and Dubin-Johnson and Rotor's
syndromes), hepatic circulatory disorders (e.g., hepatic vein thrombosis and portal vein
obstruction and thrombosis), hepatitis (e.g., chronic active hepatitis, acute viral hepatitis, and
toxic and drug-induced hepatitis), cirrhosis (e.g., alcoholic cirrhosis, biliary cirrhosis, and
hemochromatosis), or malignant tumors (e.g., primary carcinoma, hepatoblastoma, and
angiosarcoma).

In another example, MANGO 003 polypeptides, nucleic acids, or modulators
thereof, can be used to treat disorders of skeletal muscle, such as muscular dystrophy (e.g.,
Duchenne Muscular Dystrophy, Becker Muscular Dystrophy, Emery-Dreifuss Muscular
Dystrophy, Limb-Girdle Muscular Dystrophy, Facioscapulohumeral Muscular Dystrophy,
Myotonic Dystrophy, Oculopharyngeal Muscular Dystrophy, Distal Muscular Dystrophy,
and Congenital Muscular Dystrophy), motor neuron diseases (e.g., Amyotrophic Lateral
Sclerosis, Infantile Progressive Spinal Muscular Atrophy, Intermediate Spinal Muscular
Atrophy, Spinal Bulbar Muscular Atrophy, and Adult Spinal Muscular Atrophy),
myopathies (e.g., inflammatory myopathies (e.g., Dermatomyositis and Polymyositis),
Myotonia Congenita, Paramyotonia Congenita, Central Core Disease, Nemaline Myopathy,
Myotubular Myopathy, and Periodic Paralysis), and metabolic diseases of muscle (e.g.,
Phosphorylase Deficiency, Acid Maltase Deficiency, Phosphofructokinase Deficiency,
Debrancher Enzyme Deficiency, Mitochondrial Myopathy, Carnitine Deficiency, Carnitine
Palmityl Transferase Deficiency, Phosphoglycerate Kinase Deficiency, Phosphoglycerate
Mutase Deficiency, Lactate Dehydrogenase Deficiency, and Myoadenylate Deaminase
Deficiency).

In another example, MANGO 003 polypeptides, nucleic acids, or modulators
thereof, can be used to treat renal disorders, such as glomerular diseases (e.g., acute and
chronic glomerulonephritis, rapidly progressive glomerulonephritis, nephrotic syndrome,
focal proliferative glomerulonephritis, glomerular lesions associated with systemic disease,
such as systemic lupus erythematosus, Goodpasture's syndrome, multiple myeloma,
diabetes, neoplasia, sickle cell disease, and chronic inflammatory diseases), tubular diseases
(e.g., acute tubular necrosis and acute renal failure, polycystic renal disease, medullary
sponge kidney, medullary cystic disease, nephrogenic diabetes, and renal tubular acidosis),
tubulointerstitial diseases (e.g., pyelonephritis, drug and toxin induced tubulointerstitial
nephritis, hypercalcemic nephropathy, and hypokalemic nephropathy), acute and rapidly
progressive renal failure, chronic renal failure, nephrolithiasis, vascular diseases (e.g.,
hypertension and nephrosclerosis, microangiopathic hemolytic anemia, atheroembolic renal
disease, diffuse cortical necrosis, and renal infarcts), or tumors (e.g., renal cell carcinoma
and nephroblastoma).

Further, in light of MANGO 003's pattern of expression in mice, MANGO 003
expression can be utilized as a marker for specific tissues (e.g., liver, skeletal muscle,
kidney) and/or cells (e.g., hepatic, skeletal muscle, renal) in which MANGO 003 is
expressed. MANGO 003 nucleic acids can also be utilized for chromosomal mapping.



MANGO 347 

A cDNA encoding human MANGO 347 was identified by analyzing the sequences
of clones present in a human brain cDNA library. 

This analysis led to the identification of a clone, jlhbad295gl2, encoding full-length
human MANGO 347. The cDNA of this clone is 1423 nucleotides long (Figure 10; SEQ
ID NO:10). The 414 nucleotide open reading frame of this cDNA, nucleotides 31 to 444 of
SEQ ID NO:10 (SEQ ID NO:12), encodes a 138 amino acid protein (Figure 10; SEQ ID
NO:11).
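
The relationship between the open reading frame coordinates and the length of the encoded protein stated above can be checked with a short calculation. The following is a minimal sketch; the function names are hypothetical, and it assumes that the quoted coordinates are 1-based and inclusive and that the stop codon lies outside the quoted range.

```python
def orf_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive, 1-based coordinate range."""
    return last_nt - first_nt + 1


def encoded_residues(first_nt, last_nt):
    """Number of complete codons (amino acids) spanned by the range."""
    length = orf_length(first_nt, last_nt)
    if length % 3 != 0:
        raise ValueError(f"{length} nucleotides is not a whole number of codons")
    return length // 3


# Human MANGO 347: nucleotides 31 to 444 of SEQ ID NO:10.
print(orf_length(31, 444))        # 414
print(encoded_residues(31, 444))  # 138
```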

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human MANGO 347 includes a 35 amino acid signal
peptide at amino acid 1 to about amino acid 35 of SEQ ID NO:11 (SEQ ID NO:110)
preceding the mature human MANGO 347 protein which corresponds to about amino acid
36 to amino acid 138 of SEQ ID NO:11 (SEQ ID NO:111).

Human MANGO 347 that has not been post-translationally modified is predicted to
have a molecular weight of 15.4 kDa prior to cleavage of its signal peptide and a molecular
weight of 11.3 kDa subsequent to cleavage of its signal peptide.
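
A predicted molecular weight for an unmodified polypeptide, before and after removal of a signal peptide, can be estimated from the amino acid sequence alone. The sketch below is illustrative only: it sums standard average residue masses and adds one water, which is an assumed method and not necessarily the one used to obtain the values above. The function names and the toy precursor sequence are hypothetical.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free amino and carboxyl termini


def molecular_weight_kda(sequence):
    """Average molecular weight of an unmodified polypeptide, in kDa."""
    return (sum(RESIDUE_MASS[aa] for aa in sequence) + WATER) / 1000.0


def mature_sequence(precursor, signal_length):
    """Sequence remaining after the first signal_length residues are cleaved."""
    return precursor[signal_length:]


# Hypothetical toy precursor: a 10-residue "signal peptide" plus 8 further residues.
toy_precursor = "MKLLLLAAAG" + "SDEKTVGR"
toy_mature = mature_sequence(toy_precursor, 10)
print(molecular_weight_kda(toy_precursor) > molecular_weight_kda(toy_mature))
```

For human MANGO 347 the same calculation would be applied to all 138 residues of SEQ ID NO:11 and then to residues 36 to 138.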

Human MANGO 347 includes a CUB domain at amino acids 40-136 of SEQ ID
NO:11 (SEQ ID NO:45). An alignment of the CUB domain of human MANGO 347 with a
consensus CUB domain amino acid sequence derived from a hidden
Markov model (SEQ ID NO:44) is shown in Figure 12.

Casein kinase II phosphorylation sites are present at amino acids 67-70, and 108-111
of SEQ ID NO:11. N-myristylation sites are present at amino acids 19-24, 31-36, 64-69,
and 113-118 of SEQ ID NO:11.

Clone jlhbad295gl2, which encodes human MANGO 347, was deposited as a
composite deposit having a designation EpM347 with the American Type Culture
Collection (ATCC®, 10801 University Boulevard, Manassas, VA 20110-2209) on June 18,
1999 and assigned Accession Number PTA-250. A description of the deposit conditions
used is set forth below. This deposit will be maintained under the terms of the Budapest
Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes
of Patent Procedure. This deposit was made merely as a convenience for those of skill in
the art and is not an admission that a deposit is required under 35 U.S.C. §112.

Figure 11 depicts a hydropathy plot of human MANGO 347. Relatively
hydrophobic regions are above the horizontal line, and relatively hydrophilic regions are
below the horizontal line. The cysteine residues (cys) are indicated by short vertical lines
just below the hydropathy trace. The hydropathy plot of Figure 11 indicates that human
MANGO 347 has a signal peptide at its amino terminus, suggesting that human MANGO
347 is a secreted protein.

Use of MANGO 347 Nucleic Acids, Polypeptides, and Modulators Thereof 

MANGO 347 includes a CUB domain. Proteins having such a domain play a role in
mediating cell interactions during development, and thus can influence a wide variety of
developmental processes, including morphogenesis, cellular migration, adhesion,
proliferation, differentiation, and/or survival. MANGO 347 polypeptides are expressed in
neural cells (e.g., brain cells). Because MANGO 347 includes a CUB domain and is expressed
in neural cells, MANGO 347 polypeptides, nucleic acids, and modulators thereof can be
used to treat disorders involving, e.g., cellular migration, proliferation, and differentiation of
a cell, e.g., a neural cell (e.g., a CNS disorder, including Alzheimer's disease, Pick's disease,
Parkinson's and other Lewy diffuse body diseases, multiple sclerosis, amyotrophic lateral
sclerosis, progressive supranuclear palsy, epilepsy, and Jakob-Creutzfieldt disease;
psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis,
mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia
or age-related memory loss; and neurological disorders, e.g., migraine).

Further, in light of MANGO 347's presence in a human brain cDNA library,
MANGO 347 expression can be utilized as a marker for specific tissues (e.g., brain) and/or
cells (e.g., brain) in which MANGO 347 is expressed. MANGO 347 nucleic acids can also
be utilized for chromosomal mapping.

TANGO 272 

A cDNA encoding human TANGO 272 was identified by analyzing the sequences 
of clones present in a human microvascular endothelial cell (HMVEC) cDNA
library.

This analysis led to the identification of a clone, jthda089h03, encoding full-length
human TANGO 272. The cDNA of this clone is 5036 nucleotides long (Figures 13A-13D;
SEQ ID NO:13). The 3149 nucleotide open reading frame of this cDNA, nucleotides
230-3379 of SEQ ID NO:13 (SEQ ID NO:15), encodes a 1050 amino acid protein (Figures
13A-13D; SEQ ID NO:14).

Northern blot analysis using the human clone jthda089h03 revealed strong 
expression of the human TANGO 272 gene in the heart. Moderate expression was detected 
in the placenta, lung, and liver, and lower levels of expression were detected in the brain,
skeletal muscle, kidney, and pancreas. 

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human TANGO 272 includes a 20 amino acid signal
peptide at amino acid 1 to about amino acid 20 of SEQ ID NO:14 (SEQ ID NO:112)
preceding the mature human TANGO 272 protein which corresponds to about amino acid
21 to amino acid 1050 of SEQ ID NO:14 (SEQ ID NO:113).

Human TANGO 272 that has not been post-translationally modified is predicted to
have a molecular weight of 112 kDa prior to cleavage of its signal peptide and a molecular
weight of 110 kDa subsequent to cleavage of its signal peptide.

Human TANGO 272 is a transmembrane protein having an extracellular domain
which extends from about amino acid 21 to about amino acid 767 of SEQ ID NO:14 (SEQ
ID NO:114), a transmembrane domain which extends from about amino acid 768 to about
amino acid 791 of SEQ ID NO:14 (SEQ ID NO:115), and a cytoplasmic domain which
extends from about amino acid 792 to amino acid 1050 of SEQ ID NO:14 (SEQ ID
NO:116).
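
As a consistency check, the signal peptide and the three domains recited above should tile the 1050 amino acid protein without gaps or overlaps. The following sketch (with a hypothetical function name) verifies this for the stated boundaries, treating the "about" positions as exact for the purpose of illustration.

```python
def check_partition(segments, protein_length):
    """Verify that (name, start, end) segments tile residues 1..protein_length.

    Returns a dict mapping each segment name to its length in residues.
    """
    expected_start = 1
    for name, start, end in segments:
        if start != expected_start:
            raise ValueError(f"{name} starts at {start}, expected {expected_start}")
        if end < start:
            raise ValueError(f"{name} ends before it starts")
        expected_start = end + 1
    if expected_start != protein_length + 1:
        raise ValueError("segments do not reach the end of the protein")
    return {name: end - start + 1 for name, start, end in segments}


# Boundaries as stated for human TANGO 272 (SEQ ID NO:14).
human_tango_272 = [
    ("signal peptide", 1, 20),
    ("extracellular domain", 21, 767),
    ("transmembrane domain", 768, 791),
    ("cytoplasmic domain", 792, 1050),
]
print(check_partition(human_tango_272, 1050))
```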

Alternatively, in another embodiment, a human TANGO 272 protein contains an
extracellular domain which extends from about amino acid 792 to amino acid 1050 of SEQ
ID NO:14 (SEQ ID NO:116), a transmembrane domain which extends from about amino
acid 768 to about amino acid 791 of SEQ ID NO:14 (SEQ ID NO:115), and a cytoplasmic
domain which extends from about amino acid 21 to about amino acid 767 of SEQ ID
NO:14 (SEQ ID NO:114).

Human TANGO 272 includes fourteen EGF-like domains at amino acids 151-181 of
SEQ ID NO:14 (SEQ ID NO:49); amino acids 200-229 of SEQ ID NO:14 (SEQ ID
NO:50); amino acids 242-272 of SEQ ID NO:14 (SEQ ID NO:51); amino acids 285-315 of
SEQ ID NO:14 (SEQ ID NO:52); amino acids 328-358 of SEQ ID NO:14 (SEQ ID
NO:53); amino acids 378-404 of SEQ ID NO:14 (SEQ ID NO:54); amino acids 417-447 of
SEQ ID NO:14 (SEQ ID NO:55); amino acids 460-490 of SEQ ID NO:14 (SEQ ID
NO:56); amino acids 503-533 of SEQ ID NO:14 (SEQ ID NO:57); amino acids 546-576 of
SEQ ID NO:14 (SEQ ID NO:58); amino acids 589-619 of SEQ ID NO:14 (SEQ ID
NO:59); amino acids 632-661 of SEQ ID NO:14 (SEQ ID NO:60); amino acids 674-704 of
SEQ ID NO:14 (SEQ ID NO:61); and amino acids 717-747 of SEQ ID NO:14 (SEQ ID
NO:62). Figures 15A-15C depict alignments of each of the EGF-like domains of TANGO
272 with consensus hidden Markov model EGF-like domains (SEQ ID NO:46). Human
TANGO 272 further includes a delta serrate ligand domain from amino acids 518 to 576 of
SEQ ID NO:14 (SEQ ID NO:63). An alignment of the delta serrate ligand domain of
human TANGO 272 with a consensus hidden Markov model of this domain (SEQ ID
NO:47) is also depicted (Figure 15B).

An RGD cell attachment site is present at amino acids 177-179 of SEQ ID NO:14.
N-glycosylation sites are present at amino acids 284-287, 405-408, 459-462, 489-492,
504-507, 588-591, 639-642, 647-650, 716-719, and 873-876 of SEQ ID NO:14. An amidation
site is present at amino acids 628-631 of SEQ ID NO:14. Protein kinase C phosphorylation
sites are present at amino acids 38-40, 70-72, 107-109, 359-361, 461-463, 594-596,
809-811, 896-898, 940-942, 977-979, and 1022-1024 of SEQ ID NO:14. Casein kinase II
phosphorylation sites are present at amino acids 30-33, 38-41, 473-476, 548-551, 579-582,
657-660, 897-900, 921-924, 940-943, and 955-958 of SEQ ID NO:14. A tyrosine kinase
phosphorylation site is present at amino acids 361-368 of SEQ ID NO:14. N-myristylation
sites are present at amino acids 14-19, 103-108, 269-274, 302-307, 325-330, 345-350,
401-406, 427-432, 434-439, 457-462, 520-525, 586-591, 606-611, 648-653, 707-712, 714-719,
769-774, 866-871, 926-931, and 1014-1019 of SEQ ID NO:14.
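
Site annotations of the kind listed above are commonly produced by scanning a protein sequence for short consensus patterns. The sketch below is illustrative only and assumes that PROSITE-style consensus patterns are used (N-glycosylation, N-{P}-[ST]-{P}; protein kinase C phosphorylation, [ST]-x-[RK]; casein kinase II phosphorylation, [ST]-x(2)-[DE]; N-myristylation, G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}). The function name and the toy sequence are hypothetical, and the toy sequence is not taken from any SEQ ID NO herein.

```python
import re

# PROSITE-style patterns rewritten as Python regular expressions.
MOTIFS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "Protein kinase C phosphorylation": r"[ST].[RK]",
    "Casein kinase II phosphorylation": r"[ST].{2}[DE]",
    "N-myristylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def find_sites(sequence, pattern):
    """Return 1-based inclusive (start, end) positions of every match.

    The pattern is wrapped in a lookahead so that overlapping sites are
    all reported.
    """
    sites = []
    for m in re.finditer(f"(?=({pattern}))", sequence):
        sites.append((m.start(1) + 1, m.end(1)))
    return sites


# Hypothetical toy sequence for illustration.
toy = "MGAASGLNKTAPSWRDTEEE"
for name, pattern in MOTIFS.items():
    print(name, find_sites(toy, pattern))
```

The site lengths produced by these patterns (four residues for N-glycosylation and casein kinase II sites, three for protein kinase C sites, six for N-myristylation sites) agree with the residue ranges recited above.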

Clone jthda089h03, which encodes human TANGO 272, was deposited as a
composite deposit having a designation EpT272 with the American Type Culture Collection
(ATCC®, 10801 University Boulevard, Manassas, VA 20110-2236) on June 18, 1999 and
assigned Accession Number PTA-250. A description of the deposit conditions used is set
forth in the section entitled "Deposit of Clones" below. This deposit will be maintained
under the terms of the Budapest Treaty on the International Recognition of the Deposit of
Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a
convenience for those of skill in the art and is not an admission that a deposit is required
under 35 U.S.C. §112.

Figure 14 depicts a hydropathy plot of human TANGO 272. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 14 indicates the presence of a
hydrophobic domain within human TANGO 272, suggesting that human TANGO 272 is a 
transmembrane protein. 
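
A hydropathy trace of the kind described above can be approximated by averaging a per-residue hydropathy value over a sliding window. The sketch below assumes the Kyte-Doolittle scale and a 19-residue window; the scale and window actually used to generate the Figures are not stated herein. The function name and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy sequence: a 19-residue leucine stretch with charged flanks.
toy = "KKDE" + "L" * 19 + "KKDE"
profile = hydropathy_profile(toy)
peak = profile.index(max(profile))
print(f"most hydrophobic window starts at residue {peak + 1}")
```

A sustained run of windows well above zero, as in the toy leucine stretch, is the kind of feature that suggests a membrane-spanning segment.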

A cDNA encoding mouse TANGO 272 was identified by analyzing the sequences of 
clones present in a mouse testis cDNA library. 

This analysis led to the identification of a clone, jtmzb062c04, encoding partial 
mouse TANGO 272. The cDNA of this clone is 2569 nucleotides long (Figures 16A-16B; 
SEQ ID NO:16). The 1492 nucleotide open reading frame of this cDNA, nucleotides 1- 
1492 of SEQ ID NO:16 (SEQ ID NO:18), encodes a 497 amino acid protein (Figures 16A- 
16B; SEQ ID NO:17).

Mouse TANGO 272 that has not been post-translationally modified is predicted to 
have a molecular weight of 53.5 kDa. 

Mouse TANGO 272 is a transmembrane protein having an extracellular domain
which extends from about amino acid 1 to about amino acid 216 of SEQ ID NO:17 (SEQ
ID NO:118), a transmembrane domain which extends from about amino acid 217 to about
amino acid 240 of SEQ ID NO:17 (SEQ ID NO:119), and a cytoplasmic domain which
extends from about amino acid 241 to amino acid 497 of SEQ ID NO:17 (SEQ ID NO:120).

Alternatively, in another embodiment, a mouse TANGO 272 protein contains an
extracellular domain which extends from about amino acid 241 to amino acid 497 of SEQ
ID NO:17 (SEQ ID NO:120), a transmembrane domain which extends from about amino
acid 217 to about amino acid 240 of SEQ ID NO:17 (SEQ ID NO:119), and a cytoplasmic
domain which extends from about amino acid 1 to about amino acid 216 of SEQ ID NO:17
(SEQ ID NO:118).

Mouse TANGO 272 includes four EGF-like domains at about amino acids 37-67 of
SEQ ID NO:17 (SEQ ID NO:64); amino acids 80-110 of SEQ ID NO:17 (SEQ ID NO:65);
amino acids 123-153 of SEQ ID NO:17 (SEQ ID NO:66); and amino acids 166-196 of SEQ
ID NO:17 (SEQ ID NO:67). Mouse TANGO 272 further includes four laminin-EGF-like
domains at about amino acids 3-37 of SEQ ID NO:17 (SEQ ID NO:68); amino acids 41-80
of SEQ ID NO:17 (SEQ ID NO:69); amino acids 83-123 of SEQ ID NO:17 (SEQ ID
NO:70); and amino acids 127-172 of SEQ ID NO:17 (SEQ ID NO:71). Figures 39A-39B
depict alignments of each of the EGF-like and laminin-EGF-like domains of TANGO 272
with consensus hidden Markov model EGF-like domains (SEQ ID NOs:46 and 48,
respectively).

Mouse TANGO 272 further includes a delta serrate ligand domain from amino acids
10 to 67 of SEQ ID NO:17 (SEQ ID NO:72). An alignment of the delta serrate ligand
domain of mouse TANGO 272 with a consensus hidden Markov model of this domain
(SEQ ID NO:47) is also depicted in Figures 39A-39B.

Based on the Prosite analysis, EGF-like domain cysteine pattern signatures are
present at amino acids 13-24, 56-67, 99-110, 142-153, and 185-196 of SEQ ID NO:17.

N-glycosylation sites are present at amino acids 36-39, 88-91, 165-168, and 323-326
of SEQ ID NO:17. An amidation site is present at amino acids 76-79 of SEQ ID NO:17.
Protein kinase C phosphorylation sites are present at amino acids 42-44, 258-260, 354-356,
388-390, 469-471, and 492-494 of SEQ ID NO:17. Casein kinase II phosphorylation sites
are present at amino acids 106-109, 192-195, 343-346, 388-391, and 446-449 of SEQ ID
NO:17. N-myristylation sites are present at amino acids 11-16, 34-39, 47-52, 54-59,
97-102, 120-125, 140-145, 163-168, 199-204, 218-223, 372-377, and 461-466 of SEQ ID
NO:17.

Figure 17 depicts a hydropathy plot of mouse TANGO 272. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 17 indicates the presence of a
hydrophobic domain within mouse TANGO 272, suggesting that mouse TANGO 272 is a 
transmembrane protein. 

A cDNA encoding rat TANGO 272 was identified by analyzing the sequences of 
clones present in a rat neonatal sciatic nerve cDNA library. 

This analysis led to the identification of a clone, atrxa6b6, encoding partial rat
TANGO 272. The cDNA of this clone is 3567 nucleotides long (Figures 33A-33C; SEQ ID
NO:19). The 1908 nucleotide open reading frame of this cDNA, nucleotides 925-2832 of
SEQ ID NO:19 (SEQ ID NO:21), encodes a 636 amino acid protein (Figures 33A-33C;
SEQ ID NO:20).

Rat TANGO 272 that has not been post-translationally modified is predicted to have
a molecular weight of 67.4 kDa.

Rat TANGO 272 is a transmembrane protein having an extracellular domain which
extends from about amino acid 1 to about amino acid 500 of SEQ ID NO:20 (SEQ ID
NO:122), a transmembrane domain which extends from about amino acid 501 to about
amino acid 524 of SEQ ID NO:20 (SEQ ID NO:123), and a cytoplasmic domain which
extends from about amino acid 525 to amino acid 636 of SEQ ID NO:20 (SEQ ID NO:124).

Alternatively, in another embodiment, a rat TANGO 272 protein contains an
extracellular domain which extends from about amino acid 525 to amino acid 636 of SEQ
ID NO:20 (SEQ ID NO:124), a transmembrane domain which extends from about amino
acid 501 to about amino acid 524 of SEQ ID NO:20 (SEQ ID NO:123), and a cytoplasmic
domain which extends from about amino acid 1 to about amino acid 500 of SEQ ID NO:20
(SEQ ID NO:122).

Rat TANGO 272 includes eleven EGF-like domains at about amino acids 18-48 of 
SEQ ID NO:20 (SEQ ID NO:73); amino acids 61-91 of SEQ ID NO:20 (SEQ ID NO:74); 
amino acids 105-137 of SEQ ID NO:20 (SEQ ID NO:75); amino acids 150-180 of SEQ ID 
NO:20 (SEQ ID NO:76); amino acids 193-223 of SEQ ID NO:20 (SEQ ID NO:77); amino 
acids 236-266 of SEQ ID NO:20 (SEQ ID NO:78); amino acids 279-309 of SEQ ID NO:20 
(SEQ ID NO:79); amino acids 322-352 of SEQ ID NO:20 (SEQ ID NO:80); amino acids 
365-394 of SEQ ID NO:20 (SEQ ID N0:81); amino acids 407-437 of SEQ ID NO:20 (SEQ 
ID NO:82); and amino acids 450-480 of SEQ ID NO:20 (SEQ ID NO:83). Figures
41A-41D depict alignments of each of the EGF-like domains of rat TANGO 272 with consensus
hidden Markov model EGF-like domains (SEQ ID NO:46).

Rat TANGO 272 further includes eleven laminin/EGF-like domains at about amino
acids 22-61 of SEQ ID NO:20 (SEQ ID NO:84); amino acids 65-105 of SEQ ID NO:20
(SEQ ID NO:85); amino acids 109-150 of SEQ ID NO:20 (SEQ ID NO:86); amino acids
154-193 of SEQ ID NO:20 (SEQ ID NO:87); amino acids 197-236 of SEQ ID NO:20 (SEQ
ID NO:88); amino acids 240-279 of SEQ ID NO:20 (SEQ ID NO:89); amino acids 283-322
of SEQ ID NO:20 (SEQ ID NO:90); amino acids 326-365 of SEQ ID NO:20 (SEQ ID
NO:91); amino acids 368-407 of SEQ ID NO:20 (SEQ ID NO:92); amino acids 411-450;
and amino acids 454-489 of SEQ ID NO:20 (SEQ ID NO:93). Figures 41A-41D depict
alignments of each of the laminin/EGF-like domains of rat TANGO 272 with consensus
hidden Markov model EGF-like domains (SEQ ID NO:48).

Rat TANGO 272 further includes a delta serrate ligand domain from amino acids
246 to 309 of SEQ ID NO:20 (SEQ ID NO:95). An alignment of the delta serrate ligand
domain of rat TANGO 272 with a consensus hidden Markov model of this domain (SEQ ID
NO:47) is also depicted in Figures 41A-41D.

Based on the Prosite analysis, EGF-like domain cysteine pattern signatures are
present at amino acids 37-48, 80-91, 126-137, 169-180, 255-266, 298-309, 341-352,
383-394, 426-437, and 469-480 of SEQ ID NO:20.

N-glycosylation sites are present at amino acids 17-20, 138-141, 192-195, 222-225, 
237-240, 321-324, 372-375, 436-439, and 449-452 of SEQ ID NO:20. A cAMP/cGMP- 
dependent protein kinase phosphorylation site is present at amino acids 618-621 of SEQ ID 
NO:20. An amidation site is present at amino acids 361-364 of SEQ ID NO:20. Protein
kinase C phosphorylation sites are present at amino acids 92-94, 327-329, 542-544, and 
596-598 of SEQ ID NO:20. Casein kinase II phosphorylation sites are present at amino 
acids 104-107, 206-209, 281-284, and 390-393 of SEQ ID NO:20. A tyrosine kinase 
phosphorylation site is present at amino acids 94-101 of SEQ ID NO:20. N-myristylation 
sites are present at amino acids 2-7, 35-40, 58-63, 78-83, 134-139, 160-165, 167-172, 190- 
195, 210-215, 253-258, 319-324, 339-344, 381-386, 404-409, 424-429, 447-452, 483-488, 
and 502-507 of SEQ ID NO:20. 

Figure 40 depicts a hydropathy plot of rat TANGO 272. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the 
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 40 indicates the presence of a 
hydrophobic domain within rat TANGO 272, suggesting that rat TANGO 272 is a 
transmembrane protein. 

A global alignment between the nucleotide sequence of the open reading frame
(ORF) of human TANGO 272 (SEQ ID NO:15) and the nucleotide sequence of the open
reading frame of mouse TANGO 272 (SEQ ID NO:18) revealed a 39.1% identity (Figures
30A-30E). The global alignment was performed using the ALIGN program version 2.0u
(Matrix file used: pam120.mat, gap penalties of -12/-4 with a global alignment score of
-79; Myers and Miller, 1989, CABIOS 4:11-7).

A local alignment between the nucleotide sequence of human TANGO 272 (SEQ ID
NO:13) and the nucleotide sequence of mouse TANGO 272 (SEQ ID NO:16) revealed
67.6% identity over nucleotides 1890-4610 of the human TANGO 272 sequence (nucleotides
10-2560 of mouse TANGO 272) (Figures 31A-31D). The local alignment was performed
using the L-ALIGN program version 2.0u54 July 1996 (Matrix file used: pam120.mat, gap
penalties of -12/-4 with a score of 8462; Huang and Miller, 1991, Adv. Appl. Math.
12:373-81).

A global alignment between the amino acid sequence of human TANGO 272 (SEQ
ID NO:14) and the amino acid sequence of mouse TANGO 272 (SEQ ID NO:17) revealed a
38.2% identity (Figures 32A-32B). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of -19; Myers and Miller, 1989, CABIOS 4:11-7).

A global alignment between the nucleotide sequence of human TANGO 272 (SEQ
ID NO:13) and the nucleotide sequence of rat TANGO 272 (SEQ ID NO:19) revealed a
55.7% identity (Figures 34A-34H). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of 8635; Myers and Miller, 1989, CABIOS 4:11-7).

A global alignment between the nucleotide sequence of mouse TANGO 272 (SEQ
ID NO:16) and the nucleotide sequence of rat TANGO 272 (SEQ ID NO:19) revealed a
43.7% identity (Figures 35A-35F). The global alignment was performed using the ALIGN
program version 2.0u (Matrix file used: pam120.mat, gap penalties of -12/-4 with a global
alignment score of 2827; Myers and Miller, 1989, CABIOS 4:11-7).

Use of TANGO 272 Nucleic Acids, Polypeptides, and Modulators Thereof

TANGO 272 includes fourteen EGF-like domains. Proteins having such domains
play a role in mediating protein-protein interactions, and thus can influence a wide variety
of biological processes, including cell surface recognition; modulation of cell-cell contact;
modulation of cell fate determination; and modulation of wound healing and tissue repair.

TANGO 272 further includes an RGD cell attachment site. Proteins having such
sites are typically extracellular matrix proteins such as collagens, laminin and
fibronectin, among others (reviewed in Ruoslahti, 1996, Annu. Rev. Cell Dev. Biol.
12:697-715). An RGD cell attachment site typically interacts with (e.g., binds to) a cell
surface receptor, such as an integrin receptor, and thus mediates a variety of biological
processes, including cellular adhesion and migration, among others.

Because TANGO 272 includes EGF-like domains and an RGD cell attachment site,
TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to treat
disorders involving, e.g., cellular migration, proliferation, and differentiation of a cell. For
example, TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to
treat neoplastic disorders, e.g., cancer, tumor metastasis.

TANGO 272 polypeptides, nucleic acids, and modulators thereof can be used to
modulate function, survival, morphology, migration, proliferation, tissue repair and/or
differentiation of cells in the tissues in which it is expressed (e.g., microvascular endothelial
cells). For example, TANGO 272 polypeptides, nucleic acids, and modulators thereof can
be used to modulate cardiovascular function, and/or to promote wound healing and tissue
repair (e.g., of the skin, cornea and mucosal lining). Accordingly, these molecules can be
used to treat a variety of cardiovascular diseases including, but not limited to,
atherosclerosis, ischemia reperfusion injury, cardiac hypertrophy, hypertension, coronary
artery disease, myocardial infarction, arrhythmia, cardiomyopathies, and congestive heart
failure.

As TANGO 272 exhibits expression in the heart, TANGO 272 nucleic acids,
proteins, and modulators thereof can be used to treat heart disorders, e.g., ischemic heart
disease, atherosclerosis, hypertension, angina pectoris, Hypertrophic Cardiomyopathy, and
congenital heart disease.

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat placental disorders, such as toxemia of pregnancy (e.g., preeclampsia
and eclampsia), placentitis, or spontaneous abortion. 

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat pulmonary (lung) disorders, such as atelectasis, cystic fibrosis,
rheumatoid lung disease, pulmonary congestion or edema, chronic obstructive airway
disease (e.g., emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis), diffuse
interstitial diseases (e.g., sarcoidosis, pneumoconiosis, hypersensitivity pneumonitis,
Goodpasture's syndrome, idiopathic pulmonary hemosiderosis, pulmonary alveolar
proteinosis, desquamative interstitial pneumonitis, chronic interstitial pneumonia, fibrosing
alveolitis, Hamman-Rich syndrome, pulmonary eosinophilia, diffuse interstitial fibrosis,
Wegener's granulomatosis, lymphomatoid granulomatosis, and lipid pneumonia), or tumors
(e.g., bronchogenic carcinoma, bronchioloalveolar carcinoma, bronchial carcinoid,
hamartoma, and mesenchymal tumors).

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat hepatic (liver) disorders, such as jaundice, hepatic failure, hereditary
hyperbilirubinemias (e.g., Gilbert's syndrome, Crigler-Najjar syndromes and Dubin-Johnson
and Rotor's syndromes), hepatic circulatory disorders (e.g., hepatic vein thrombosis and
portal vein obstruction and thrombosis), hepatitis (e.g., chronic active hepatitis, acute viral
hepatitis, and toxic and drug-induced hepatitis), cirrhosis (e.g., alcoholic cirrhosis, biliary
cirrhosis, and hemochromatosis), or malignant tumors (e.g., primary carcinoma,
hepatoblastoma, and angiosarcoma).

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat disorders of the brain, such as cerebral edema, hydrocephalus, brain 
herniations, iatrogenic disease (due to, e.g., infection, toxins, or drugs), inflammations (e.g.,
bacterial and viral meningitis, encephalitis, and cerebral toxoplasmosis), cerebrovascular
diseases (e.g., hypoxia, ischemia, and infarction, intracranial hemorrhage and vascular
malformations, and hypertensive encephalopathy), and tumors (e.g., neuroglial tumors,
neuronal tumors, tumors of pineal cells, meningeal tumors, primary and secondary
lymphomas, intracranial tumors, and medulloblastoma), and to treat injury or trauma to the
brain.

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof,
can be used to treat disorders of skeletal muscle, such as muscular dystrophy (e.g.,
Duchenne Muscular Dystrophy, Becker Muscular Dystrophy, Emery-Dreifuss Muscular
Dystrophy, Limb-Girdle Muscular Dystrophy, Facioscapulohumeral Muscular Dystrophy,
Myotonic Dystrophy, Oculopharyngeal Muscular Dystrophy, Distal Muscular Dystrophy,
and Congenital Muscular Dystrophy), motor neuron diseases (e.g., Amyotrophic Lateral
Sclerosis, Infantile Progressive Spinal Muscular Atrophy, Intermediate Spinal Muscular
Atrophy, Spinal Bulbar Muscular Atrophy, and Adult Spinal Muscular Atrophy),
myopathies (e.g., inflammatory myopathies (e.g., Dermatomyositis and Polymyositis),
Myotonia Congenita, Paramyotonia Congenita, Central Core Disease, Nemaline Myopathy,
Myotubular Myopathy, and Periodic Paralysis), and metabolic diseases of muscle (e.g.,
Phosphorylase Deficiency, Acid Maltase Deficiency, Phosphofructokinase Deficiency,
Debrancher Enzyme Deficiency, Mitochondrial Myopathy, Carnitine Deficiency, Carnitine 
Palmityl Transferase Deficiency, Phosphoglycerate Kinase Deficiency, Phosphoglycerate 
Mutase Deficiency, Lactate Dehydrogenase Deficiency, and Myoadenylate Deaminase 
Deficiency). 

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat renal disorders, such as glomerular diseases (e.g., acute and chronic 
glomerulonephritis, rapidly progressive glomerulonephritis, nephrotic syndrome, focal 
proliferative glomerulonephritis, glomerular lesions associated with systemic disease, such 
as systemic lupus erythematosus, Goodpasture's syndrome, multiple myeloma, diabetes,
neoplasia, sickle cell disease, and chronic inflammatory diseases), tubular diseases (e.g.,
acute tubular necrosis and acute renal failure, polycystic renal disease, medullary sponge
kidney, medullary cystic disease, nephrogenic diabetes, and renal tubular acidosis),
tubulointerstitial diseases (e.g., pyelonephritis, drug and toxin induced tubulointerstitial
nephritis, hypercalcemic nephropathy, and hypokalemic nephropathy), acute and rapidly
progressive renal failure, chronic renal failure, nephrolithiasis, vascular diseases (e.g., 
hypertension and nephrosclerosis, microangiopathic hemolytic anemia, atheroembolic renal 
disease, diffuse cortical necrosis, and renal infarcts), or tumors (e.g., renal cell carcinoma 
and nephroblastoma). 

In another example, TANGO 272 polypeptides, nucleic acids, or modulators thereof, 
can be used to treat pancreatic disorders, such as pancreatitis (e.g., acute hemorrhagic 
pancreatitis and chronic pancreatitis), pancreatic cysts (e.g., congenital cysts, pseudocysts, 
and benign or malignant neoplastic cysts), pancreatic tumors (e.g., pancreatic carcinoma and 
adenoma), diabetes mellitus (e.g., insulin- and non-insulin-dependent types, impaired 
glucose tolerance, and gestational diabetes), or islet cell tumors (e.g., insulinomas, 
adenomas, Zollinger-Ellison syndrome, glucagonomas, and somatostatinoma).

Further, in light of TANGO 272's pattern of expression in humans, TANGO 272
expression can be utilized as a marker for specific tissues (e.g., cardiovascular) and/or cells
(e.g., cardiac) in which TANGO 272 is expressed. TANGO 272 nucleic acids can also be
utilized for chromosomal mapping.

TANGO 295

A cDNA encoding human TANGO 295 was identified by analyzing the sequences 
of clones present in a human mammary epithelium cDNA library.

This analysis led to the identification of a clone, jthvb023d09, encoding full-length 
human TANGO 295. The cDNA of this clone is 1497 nucleotides long (Figure 18; SEQ ID 
NO:22). The 468 nucleotide open reading frame of this cDNA, nucleotides 217-684 of
SEQ ID NO:22 (SEQ ID NO:34), encodes a 156 amino acid protein (Figure 18; SEQ ID 
NO:23). 
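
The coordinates above can be cross-checked with simple arithmetic: an inclusive range of nucleotides 217-684 spans 468 nucleotides, which corresponds to 156 codons. The following is a minimal sketch of that check; the function names are illustrative and are not part of the disclosure.

```python
def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the range; fails if it is not a multiple of three."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


# Coordinates stated for the human TANGO 295 open reading frame (SEQ ID NO:22).
print(orf_length(217, 684), codon_count(217, 684))
```

The same check can be applied to the other open reading frames described herein, e.g., nucleotides 62-976 of SEQ ID NO:25 (915 nucleotides, 305 codons).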

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein 
Engineering 10:1-6) predicted that human TANGO 295 includes a 28 amino acid signal 
peptide at amino acid 1 to about amino acid 28 of SEQ ID NO:23 (SEQ ID NO:125)
preceding the mature human TANGO 295 protein which corresponds to about amino acid
29 to amino acid 156 of SEQ ID NO:23 (SEQ ID NO:126).

Human TANGO 295 that has not been post-translationally modified is predicted to 
have a molecular weight of 17.5 kDa prior to cleavage of its signal peptide and a molecular 
weight of 14.6 kDa subsequent to cleavage of its signal peptide.

Secretion assays reveal that human TANGO 295 protein is secreted as a 17 kDa 
protein. The secretion assays were performed as follows: 8x10^ 293T cells were plated per 
well in a 6-well plate and the cells were incubated in growth medium (DMEM, 10% fetal 
bovine serum, penicillin/streptomycin) at 37°C, 5% CO2 overnight. 293T cells were 
transfected with 2 µg of full-length MANGO 245 inserted in the pMET7 vector/well and 10
µg LipofectAMINE (GIBCO/BRL Cat. # 18324-012)/well according to the protocol for
GIBCO/BRL LipofectAMINE. The transfectant was removed 5 hours later and fresh
growth medium was added to allow the cells to recover overnight. The medium was
removed and each well was gently washed twice with DMEM without methionine and
cysteine (ICN Cat. # 16-424-54). 1 ml DMEM without methionine and cysteine with 50
µCi Trans-35S (ICN Cat. # 51006) was added to each well and the cells were incubated at
37°C, 5% CO2 for the appropriate time period. A 150 µl aliquot of conditioned medium
was obtained and 150 µl of 2X SDS sample buffer was added to the aliquot. The sample
was heat-inactivated and loaded on a 4-20% SDS-PAGE gel. The gel was fixed and the
presence of secreted protein was detected by autoradiography. 

Human TANGO 295 includes a pancreatic ribonuclease domain at amino acids 32-156
of SEQ ID NO:23 (SEQ ID NO:97). Figure 20 depicts an alignment of the pancreatic
ribonuclease domain of human TANGO 295 with a consensus hidden Markov model
pancreatic ribonuclease domain (SEQ ID NO:96).

An N-glycosylation site is present at amino acids 127-130 of SEQ ID NO:23. A 
cAMP/cGMP dependent protein kinase site is present at amino acids 139-142 of SEQ ID
NO:23. Protein kinase C phosphorylation sites are present at amino acids 27-29, 62-64, 85-87,
and 113-115 of SEQ ID NO:23. N-myristylation sites are present at amino acids 18-23,
and 32-37 of SEQ ID NO:23. 
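
The disclosure does not state how the N-glycosylation site was identified. As an illustration only, sites of this kind are commonly located by scanning for the PROSITE-style consensus N-{P}-[ST]-{P}, a four-residue pattern that is consistent with the four-residue span (127-130) reported above. The sketch below assumes that pattern; the function name is illustrative, and the example peptide is hypothetical and is not SEQ ID NO:23.

```python
import re

# Assumed consensus: Asn, any residue except Pro, Ser or Thr, any residue except Pro.
# The zero-width lookahead lets overlapping candidate sites be reported.
N_GLYC_PATTERN = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glycosylation_sites(sequence: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of candidate sites."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.start() + 4) for m in N_GLYC_PATTERN.finditer(sequence)]


# Hypothetical peptide: "NGSA" matches; "NPSA" does not because Pro follows Asn.
print(find_n_glycosylation_sites("AANGSAANPSA"))
```

A pattern match of this kind only flags candidate sites; whether a given asparagine is actually glycosylated is not determined by the motif alone.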

Global alignment of the human TANGO 295 and GenPept AF037081 amino acid
sequences revealed 53.2% identity (Matrix file used: pam120.mat, gap penalties of -12/-4;
Myers and Miller, 1989, CABIOS 4:11-7) (Figure 36). A global alignment of the human
TANGO 295 and GenPept AF037081 nucleotide sequences revealed a 22.6% identity
between these two sequences (Figures 37A-37C) (Matrix file used: pam120.mat, gap
penalties of -12/-4 with a global alignment score of -2718; Myers and Miller, 1989,
CABIOS 4:11-7).

Local alignment of the human TANGO 295 and Genbank AF037081 nucleotide 
sequences revealed 62.7% identity between nucleotides 235-687 of human TANGO 295, 
and nucleotides 3-453 of AF037081; 43.4% identity between nucleotides 410-850 of human 
TANGO 295, and nucleotides 3-450 of AF037081; and 46.5% identity between nucleotides 
432-700 of human TANGO 295, and nucleotides 5-251 of AF037081 (Matrix file used: 
pam120.mat, gap penalties of -12/-4 with a global alignment score of 1214; Huang and
Miller, 1991, Adv. Appl. Math. 12:373-81) (Figures 38A-38B).
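
Percent identity figures such as those above are derived from an alignment by counting identical aligned positions. The disclosure does not specify the denominator used; the sketch below assumes that identity is computed over all alignment columns, gaps included, which is one common convention. The function name and the example strings are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if the residues match and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical six-column alignment in which four columns are identical.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so values computed under different conventions are not directly comparable.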

Clone jthvb023d09, which encodes human TANGO 295, was deposited as a 
composite deposit having a designation EpT295 with the American Type Culture Collection 
(ATCC® 10801 University Boulevard, Manassas, VA 20110-2209) on June 18, 1999 and 
assigned Accession Number PTA-249. Deposit conditions are described below in the 
section entitled "Deposit of Clones". This deposit will be maintained under the terms of the 
Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the
Purposes of Patent Procedure. This deposit was made merely as a convenience for those of
skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

Figure 19 depicts a hydropathy plot of human TANGO 295. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 19 indicates that human TANGO 295 
has a signal peptide at its amino terminus, suggesting that human TANGO 295 is a secreted 
protein. 
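
The disclosure does not name the hydropathy scale or the window size used for Figure 19. As a sketch only, a profile of this kind can be computed as a sliding-window average over a per-residue scale; the example below assumes the Kyte-Doolittle scale and a nine-residue window, both of which are assumptions rather than details taken from the text. The example peptide is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive is hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of each full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


# Hypothetical peptide: a leucine-rich stretch followed by charged residues.
profile = hydropathy_profile("MLLLLLLLLDKDEDKDE", window=9)
print([round(value, 2) for value in profile])
```

In such a profile, a hydrophobic stretch at the amino terminus appears as a run of positive window averages, which is the kind of feature the text associates with a signal peptide.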

Use of TANGO 295 Nucleic Acids, Polypeptides, and Modulators Thereof

TANGO 295 includes a pancreatic ribonuclease domain. Proteins having such 
domains have pyrimidine-specific endonuclease activity, and are present at elevated levels
in the pancreas of various mammals and a few reptiles. TANGO 295 shows some structural
similarities to Ribonuclease k6 (RNase k6). RNase k6 is expressed in human monocytes
and neutrophils (but not in eosinophils), suggesting a role for this ribonuclease in regulating
host defense. Based on the structural similarities between TANGO 295 and RNase k6,
TANGO 295 may play a role in regulating host defense.

TANGO 295 polypeptides, nucleic acids, and modulators thereof, can be used to 
modulate the function, morphology, proliferation and/or differentiation of cells in the 
tissues in which it is expressed (e.g., mammary epithelium). Accordingly, TANGO 295
polypeptides, nucleic acids, and modulators thereof can be used to treat epithelial disorders,
e.g., mammary epithelial disorders (e.g., breast cancer).

Further, in light of TANGO 295's presence in a human mammary epithelium cDNA
library, TANGO 295 expression can be utilized as a marker for specific tissues (e.g., breast)
and/or cells (e.g., mammary) in which TANGO 295 is expressed. TANGO 295 nucleic
acids can also be utilized for chromosomal mapping. 

TANGO 354 

A cDNA encoding human TANGO 354 was identified by analyzing the sequences 
of clones present in a Mixed Lymphocyte Reaction (MLR) cDNA library.

This analysis led to the identification of a clone, jthLa042a04, encoding full-length 
human TANGO 354. The cDNA of this clone is 1788 nucleotides long (Figures 21A-21B;
SEQ ID NO:25). The 915 nucleotide open reading frame of this cDNA, nucleotides 62-976
of SEQ ID NO:25 (SEQ ID NO:27), encodes a 305 amino acid protein (Figures 21A-21B;
SEQ ID NO:26).

Human TANGO 354 that has not been post-translationally modified is predicted to 
have a molecular weight of 33.8 kDa prior to cleavage of its signal peptide (31.6 kDa after 
cleavage of its signal peptide). 

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human TANGO 354 includes a 19 amino acid signal
peptide at amino acid 1 to about amino acid 19 of SEQ ID NO:26 (SEQ ID NO:127)
preceding the mature human TANGO 354 protein which corresponds to about amino acid 
20 to amino acid 305 of SEQ ID NO:26 (SEQ ID NO:128). 

Human TANGO 354 is a transmembrane protein having an extracellular domain
which extends from about amino acid 20 to about amino acid 169 of SEQ ID NO:26 (SEQ
ID NO:129), a transmembrane domain which extends from about amino acid 170 to about
amino acid 193 of SEQ ID NO:26 (SEQ ID NO:130), and a cytoplasmic domain which
extends from about amino acid 194 to amino acid 305 of SEQ ID NO:26 (SEQ ID NO:131).
Alternatively, in another embodiment, a human TANGO 354 protein contains an
extracellular domain which extends from about amino acid 194 to amino acid 305 of SEQ
ID NO:26 (SEQ ID NO:131), a transmembrane domain which extends from about amino
acid 170 to about amino acid 193 of SEQ ID NO:26 (SEQ ID NO:130), and a cytoplasmic
domain which extends from about amino acid 20 to about amino acid 169 of SEQ ID
NO:26 (SEQ ID NO:129).

Human TANGO 354 includes an immunoglobulin domain at amino acids 33-110 of
SEQ ID NO:26 (SEQ ID NO:41). Figure 23 depicts alignments of the immunoglobulin
domains of TANGO 354 with consensus hidden Markov model immunoglobulin domains
(SEQ ID NO:37).

An N-glycosylation site is present at amino acids 88-91 of SEQ ID NO:26. A 
cAMP and cGMP-dependent protein kinase phosphorylation site is present at amino acids 
233-236 of SEQ ID NO:26. Protein kinase C phosphorylation sites are present at amino
acids 81-83, 231-233, and 236-238 of SEQ ID NO:26. Casein kinase II phosphorylation
sites are present at amino acids 44-47, 69-72, 81-84, 94-97, 101-104, 113-116, and 146-149
of SEQ ID NO:26. A tyrosine kinase phosphorylation site is present at amino acids 291-299
of SEQ ID NO:26. N-myristylation sites are present at amino acids 30-35 and 109-114
of SEQ ID NO:26.

Clone jthLa042a04, which encodes human TANGO 354, was deposited as EpT354 
with the American Type Culture Collection (ATCC® 10801 University Boulevard,
Manassas, VA 20110-2209) on June 18, 1999 and assigned Accession Number PTA-249.
This deposit will be maintained under the terms of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This
deposit was made merely as a convenience for those of skill in the art and is not an
admission that a deposit is required under 35 U.S.C. § 112.

Figure 22 depicts a hydropathy plot of human TANGO 354. Relatively hydrophobic
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 22 indicates the presence of a 
hydrophobic domain within human TANGO 354, suggesting that human TANGO 354 is a 
transmembrane protein. 

Use of TANGO 354 Nucleic Acids, Polypeptides, and Modulators Thereof

TANGO 354 includes an immunoglobulin-like domain. Proteins having such 
domains play a role in mediating protein-protein and protein-ligand interactions, and thus 
can influence a wide variety of biological processes, including modulation of cell surface 
recognition; modulation of cellular motility, e.g., chemotaxis and chemokinesis; 
transduction of an extracellular signal (e.g., by interacting with a ligand and/or a cell-
surface receptor); and/or modulation of signal transduction pathways.

TANGO 354 polypeptides, nucleic acids, and modulators thereof can be used to 
modulate function, survival, morphology, migration, proliferation and/or differentiation of 
cells in the tissues in which it is expressed (e.g., hematopoietic tissues).

Because of the presence of an immunoglobulin domain and the expression of 
TANGO 354 in hematopoietic cells, TANGO 354 polypeptides, nucleic acids, and 
modulators thereof can be used to modulate (e.g., increase or decrease) hematopoietic 
function, thereby influencing one or more of: (1) regulation of hematopoiesis; (2) 
modulation of haemostasis; (3) modulation of an inflammatory response; (4) modulation of
neoplastic growth, e.g., inhibition of tumor growth; and/or (5) regulation of thrombolysis. 

Accordingly, TANGO 354 polypeptides, nucleic acids, and modulators thereof can 
be used to treat a variety of hematopoietic diseases including, but not limited to, myeloid 
disorders and/or lymphoid malignancies. Exemplary myeloid diseases that can be treated 
include acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and
chronic myelogenous leukemia (CML) (reviewed in Vaickus, 1991, Crit. Rev. in
Oncol./Hemotol. 11:267-97). Exemplary lymphoid malignancies that can be treated using
these molecules include acute lymphoblastic leukemia (ALL) which includes B-lineage 
ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia 
(PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM).
Additional forms of malignant lymphomas include non-Hodgkin lymphoma and variants
thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-
cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF) and Hodgkin's disease.

In one embodiment, TANGO 354 polypeptides, nucleic acids, and modulators 
thereof can be used to treat a variety of neoplastic diseases, including malignancies of the
various organ systems, such as those affecting lung, breast, lymphoid, gastrointestinal, and
genito-urinary tract, as well as adenocarcinomas which include malignancies such as most 
colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell 
carcinoma of the lung, cancer of the small intestine and cancer of the esophagus. 

The term "carcinoma" is art recognized and refers to malignancies of epithelial or
endocrine tissues including respiratory system carcinomas, gastrointestinal system
carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas,
prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas
include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon
and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors
composed of carcinomatous and sarcomatous tissues. An "adenocarcinoma" refers to a
carcinoma derived from glandular tissue or in which the tumor cells form recognizable
glandular structures. The term "sarcoma" is art recognized and refers to malignant tumors 
of mesenchymal derivation. 

TANGO 354 polypeptides, nucleic acids, and modulators thereof can also be used to 
treat a variety of non-cancerous diseases or conditions involving, for example, aberrant T 
cell activity (e.g., aberrant T cell proliferation and/or secretion). Examples of such T cell
diseases or conditions include inflammation; allergy, for example, atopic allergy; organ 
rejection after transplantation (e.g., skin graft, cardiac graft, islet graft); graft-versus-host 
disease; autoimmune diseases (including, for example, diabetes mellitus, arthritis (including 
rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), 
multiple sclerosis, encephalomyelitis, diabetes, myasthenia gravis, systemic lupus 
erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and
eczematous dermatitis), psoriasis, Sjögren's Syndrome, including keratoconjunctivitis sicca
secondary to Sjögren's Syndrome, alopecia areata, allergic responses due to arthropod bite
reactions, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, 
ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, 
vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, 
autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic 
encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, 
pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's 
granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue,
lichen planus, Crohn's disease, Graves ophthalmopathy, sarcoidosis, primary biliary
cirrhosis, uveitis posterior, and interstitial lung fibrosis).

Further, in light of TANGO 354's presence in a Mixed Lymphocyte Reaction cDNA
library, TANGO 354 expression can be utilized as a marker for specific tissues (e.g.,
lymphoid tissues such as the thymus and spleen) and/or cells (e.g., lymphocytes) in which
TANGO 354 is expressed. TANGO 354 nucleic acids can also be utilized for chromosomal
mapping. 

TANGO 378

A cDNA encoding human TANGO 378 was identified by analyzing the sequences
of clones present in a human natural killer cell cDNA library. 

This analysis led to the identification of a clone, jthta028f04, encoding full-length
human TANGO 378. The cDNA of this clone is 3258 nucleotides long (Figures 24A-24C;
SEQ ID NO:28). The 1584 nucleotide open reading frame of this cDNA, nucleotides 42 to
1625 of SEQ ID NO:28 (SEQ ID NO:30), encodes a 528 amino acid protein (Figure 25;
SEQ ID NO:29).

The signal peptide prediction program SIGNALP (Nielsen et al., 1997, Protein
Engineering 10:1-6) predicted that human TANGO 378 includes a 21 amino acid signal
peptide at amino acid 1 to about amino acid 21 of SEQ ID NO:29 (SEQ ID NO:132)
preceding the mature human TANGO 378 protein which corresponds to about amino acid
22 to amino acid 528 of SEQ ID NO:29 (SEQ ID NO:133). 

Human TANGO 378 that has not been post-translationally modified is predicted to 
have a molecular weight of 59.0 kDa prior to cleavage of its signal peptide and a molecular 
weight of 56.7 kDa subsequent to cleavage of its signal peptide.

Human TANGO 378 is a seven transmembrane G-protein coupled receptor (GPCR) 
protein having an N-terminal extracellular domain which extends from about amino acid 22 
to about amino acid 244 of SEQ ID NO:29 (SEQ ID NO:134); seven transmembrane
domains which extend from about amino acid 245 to about amino acid 269 of SEQ ID
NO:29 (SEQ ID NO:135), about amino acid 287 to about amino acid 306 of SEQ ID
NO:29 (SEQ ID NO:136), about amino acid 323 to about amino acid 343 of SEQ ID
NO:29 (SEQ ID NO:137), about amino acid 358 to about amino acid 376 of SEQ ID
NO:29 (SEQ ID NO:138), about amino acid 414 to about amino acid 438 of SEQ ID
NO:29 (SEQ ID NO:139), about amino acid 457 to about amino acid 477 of SEQ ID
NO:29 (SEQ ID NO:140), and about amino acid 485 to about amino acid 504 of SEQ ID
NO:29 (SEQ ID NO:141); and a C-terminal cytoplasmic domain which extends from about
amino acid 505 to amino acid 528 of SEQ ID NO:29 (SEQ ID NO:142). Figure 26 depicts
an alignment of each of the transmembrane domains of TANGO 378 with the consensus
hidden Markov model seven transmembrane receptor sequences (SEQ ID NO:98).

Alternatively, in another embodiment, a human TANGO 378 protein contains an N-
terminal extracellular domain which extends from about amino acid 505 to amino acid 528
of SEQ ID NO:29 (SEQ ID NO:142); seven transmembrane domains which extend from
about amino acid 245 to about amino acid 269 of SEQ ID NO:29 (SEQ ID NO:135), about
amino acid 287 to about amino acid 306 of SEQ ID NO:29 (SEQ ID NO:136), about
amino acid 323 to about amino acid 343 of SEQ ID NO:29 (SEQ ID NO:137), about
amino acid 358 to about amino acid 376 of SEQ ID NO:29 (SEQ ID NO:138), about
amino acid 414 to about amino acid 438 of SEQ ID NO:29 (SEQ ID NO:139), about
amino acid 457 to about amino acid 477 of SEQ ID NO:29 (SEQ ID NO:140), and about
amino acid 485 to about amino acid 504 of SEQ ID NO:29 (SEQ ID NO:141); and a C-
terminal cytoplasmic domain which extends from about amino acid 22 to about amino acid
244 of SEQ ID NO:29 (SEQ ID NO:134).

Human TANGO 378 includes three extracellular loops which extend from about
amino acid 307 to about amino acid 322 of SEQ ID NO:29 (SEQ ID NO:143), about amino
acid 377 to about amino acid 413 of SEQ ID NO:29 (SEQ ID NO:144), and about amino 
acid 478 to about amino acid 484 of SEQ ID NO:29 (SEQ ID NO:145). 

Human TANGO 378 includes three intracellular loops which extend from about 
amino acid 270 to about amino acid 286 of SEQ ID NO:29 (SEQ ID NO:146), about amino 
acid 344 to about amino acid 357 of SEQ ID NO:29 (SEQ ID NO:147), and about amino
acid 439 to about amino acid 456 of SEQ ID NO:29 (SEQ ID NO:148).
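
The domain boundaries recited above for the first embodiment can be checked for internal consistency: the N-terminal extracellular domain, the seven transmembrane domains, the six loops, and the C-terminal cytoplasmic domain should tile amino acids 22-528 of SEQ ID NO:29 without gaps or overlaps. The sketch below encodes the stated coordinates; the data structure and function name are illustrative and are not part of the disclosure.

```python
# (label, first residue, last residue) as recited for human TANGO 378 (SEQ ID NO:29).
TANGO_378_SEGMENTS = [
    ("extracellular N-terminal domain", 22, 244),
    ("transmembrane domain 1", 245, 269),
    ("intracellular loop 1", 270, 286),
    ("transmembrane domain 2", 287, 306),
    ("extracellular loop 1", 307, 322),
    ("transmembrane domain 3", 323, 343),
    ("intracellular loop 2", 344, 357),
    ("transmembrane domain 4", 358, 376),
    ("extracellular loop 2", 377, 413),
    ("transmembrane domain 5", 414, 438),
    ("intracellular loop 3", 439, 456),
    ("transmembrane domain 6", 457, 477),
    ("extracellular loop 3", 478, 484),
    ("transmembrane domain 7", 485, 504),
    ("cytoplasmic C-terminal domain", 505, 528),
]


def tiles_range(segments, first: int, last: int) -> bool:
    """True if the ordered segments cover first..last with no gaps or overlaps."""
    expected_start = first
    for _label, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == last + 1


print(tiles_range(TANGO_378_SEGMENTS, 22, 528))
```

Because the text qualifies most boundaries with "about", a check of this kind confirms only that the recited coordinates are mutually consistent, not that the boundaries are exact.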

N-glycosylation sites are present at amino acids 18-21, 58-61, 65-68, 146-149, 173- 
176, 179-182, 394-397, and 400-403 of SEQ ID NO:29. A cAMP and cGMP-dependent 
protein kinase phosphorylation site is present at amino acids 274-277 of SEQ ID NO:29. 
Protein kinase C phosphorylation sites are present at amino acids 45-47, 93-95, 375-377, 
437-439, 449-451, and 505-507 of SEQ ID NO:29. Casein kinase II phosphorylation sites
are present at amino acids 23-26, 29-32, and 510-513 of SEQ ID NO:29. N-myristylation
sites are present at amino acids 86-91, 101-106, 157-162, 255-260, 311-316, 420-425, and
467-472 of SEQ ID NO:29. A thiol (cysteine) protease histidine site is present at amino
acids 410-420 of SEQ ID NO:29.

Clone jthta028f04, which encodes human TANGO 378, was deposited as EpT378
with the American Type Culture Collection (ATCC® 10801 University Boulevard,
Manassas, VA 20110-2209) on June 18, 1999 and assigned Accession Number PTA-249.
This deposit will be maintained under the terms of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This
deposit was made merely as a convenience for those of skill in the art and is not an
admission that a deposit is required under 35 U.S.C. § 112.

Figure 25 depicts a hydropathy plot of human TANGO 378. Relatively hydrophobic 
regions are above the horizontal line, and relatively hydrophilic regions are below the
horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below 
the hydropathy trace. The hydropathy plot of Figure 25 indicates that human TANGO 378 
has a signal peptide at its amino terminus and seven hydrophobic domains within human 
TANGO 378, suggesting that human TANGO 378 is a transmembrane protein. 

Use of TANGO 378 Nucleic Acids, Polypeptides, and Modulators Thereof

TANGO 378 includes a seven transmembrane domain which is typically found in 
G-protein coupled receptors. Proteins having such a domain play a role in transducing an 
extracellular signal, e.g., by interacting with a ligand and/or a cell-surface receptor, 
followed by mobilization of intracellular molecules that participate in signal transduction
pathways (e.g., adenylate cyclase, or phosphatidylinositol 4,5-bisphosphate (PIP2), inositol
1,4,5-triphosphate (IP3)).

TANGO 378 polypeptides, nucleic acids, and modulators thereof can be used to 
modulate function, survival, morphology, migration, proliferation and/or differentiation of 
cells in the tissues in which it is expressed (e.g., natural killer cells). For example, TANGO
378 polypeptides, nucleic acids, and modulators thereof can be used to modulate an immune
response in a subject by, for example, (1) modulating immune cytotoxic responses against
pathogenic organisms, e.g., viruses, bacteria, and parasites; (2) by modulating organ
rejection after transplantation (e.g., skin graft, cardiac graft, islet graft); (3) by modulating
immune recognition and lysis of normal and malignant cells; (4) by modulating T cell
diseases; and (5) by controlling neoplastic growth, e.g., inhibition of tumor growth.

Accordingly, TANGO 378 polypeptides, nucleic acids, and modulators thereof can 
be used to treat a variety of diseases involving aberrant immune responses, for example, 
aberrant T cell activity (e.g., aberrant T cell proliferation and/or secretion). A non-limiting
list of diseases involving aberrant T cell activity is provided in the section entitled 
"TANGO 354" above. 

In other embodiments, TANGO 378 polypeptides, nucleic acids, and modulators 
thereof can be used to treat a variety of neoplastic diseases, including hematopoietic 
malignancies and including, but not limited to, myeloid disorders, lymphoid malignancies, 
and/or malignancies of the various organ systems. A non-limiting list of such neoplastic
diseases is provided in the section entitled "TANGO 354" above. 

Further, in light of TANGO 378's presence in a Natural Killer cell cDNA library,
TANGO 378 expression can be utilized as a marker for specific tissues (e.g., lymphoid
tissues such as the thymus and spleen) and/or cells (e.g., Natural Killer cells) in which
TANGO 378 is expressed. TANGO 378 nucleic acids can also be utilized for chromosomal
mapping. 

Tables 1 and 2 below provide summaries of INTERCEPT 340, MANGO 003,
MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378 sequence
information.



TABLE 1: Summary of Sequence Information for INTERCEPT 340, MANGO 003,
MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378

Gene                  cDNA            ORF             Polypeptide     Figure          Accession Number
INTERCEPT 340 human   SEQ ID NO:1     SEQ ID NO:3     SEQ ID NO:2     Figs. 1A-1B     PTA-250
MANGO 003 human       SEQ ID NO:4     SEQ ID NO:6     SEQ ID NO:5     Figs. 4A-4C     207178
MANGO 003 mouse       SEQ ID NO:7     SEQ ID NO:9     SEQ ID NO:8     Fig. 8
MANGO 347 human       SEQ ID NO:10    SEQ ID NO:12    SEQ ID NO:11    Fig. 10         PTA-250
TANGO 272 human       SEQ ID NO:13    SEQ ID NO:15    SEQ ID NO:14    Figs. 13A-13D   PTA-250
TANGO 272 mouse       SEQ ID NO:16    SEQ ID NO:18    SEQ ID NO:17    Figs. 16A-16B
TANGO 272 rat         SEQ ID NO:19    SEQ ID NO:21    SEQ ID NO:20    Figs. 33A-33C
TANGO 295 human       SEQ ID NO:22    SEQ ID NO:24    SEQ ID NO:23    Fig. 18         PTA-249
TANGO 354 human       SEQ ID NO:25    SEQ ID NO:27    SEQ ID NO:26    Figs. 21A-21B   PTA-249
TANGO 378 human       SEQ ID NO:28    SEQ ID NO:30    SEQ ID NO:29    Figs. 24A-24C   PTA-249

TABLE 2: Summary of Protein Domains of INTERCEPT 340, MANGO 003,
MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378

Protein          Signal           Mature           Extracellular    Transmembrane    Cytoplasmic
                 Peptide          Protein          Domain           Domain           Domain

INTERCEPT 340
human

MANGO 003        AA 1-24 of       AA 25-504 of     AA 25-374 of     AA 375-398 of    AA 399-504 of
                 SEQ ID NO:5      SEQ ID NO:5      SEQ ID NO:5      SEQ ID NO:5      SEQ ID NO:5
                 SEQ ID NO:101    SEQ ID NO:102    SEQ ID NO:103    SEQ ID NO:104    SEQ ID NO:105

MANGO 003                         AA 1-208 of      AA 1-73 of       AA 74-96 of      AA 97-208 of
mouse                             SEQ ID NO:8      SEQ ID NO:8      SEQ ID NO:8      SEQ ID NO:8
                                  SEQ ID NO:106    SEQ ID NO:107    SEQ ID NO:108    SEQ ID NO:109
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SEQIDNO:20 
SEQ ID NO: 121 
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SBQIDNO:20 
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SBQIDNO:20 
SBQIDNO:123 


AA 525-636 of 
SEQIDNO:20 
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human 
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SEQ ID NO:23 
SEQIDNO:125 
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SEQIDNO:23 
SEQ ID NO: 126 
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SEQ ID NO: 128 
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SEQIDNO:26 
SEQ ID NO: 129 
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SEQ ID NO: 131 
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TABLE 2 continued 
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Various aspects of the invention are described in further detail in the following 
25 subsections 

I. Isolated Nucleic Acid Molecules 

One aspect of the invention pertains to isolated nucleic acid molecules that encode a 
polypeptide of the invention or a biologically active portion thereof, as well as nucleic acid 
molecules sufficient for use as hybridization probes to identify nucleic acid molecules 
encoding a polypeptide of the invention and fragments of such nucleic acid molecules 
suitable for use as PCR primers for the amplification or mutation of nucleic acid molecules. 
As used herein, the term "nucleic acid molecule" is intended to include DNA molecules 
(e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA 
or RNA generated using nucleotide analogs. The nucleic acid molecule can be 
single-stranded or double-stranded, but preferably is double-stranded DNA. 

An "isolated" nucleic acid molecule is one which is separated from other nucleic 
acid molecules which are present in the natural source of the nucleic acid molecule. 
Preferably, an "isolated" nucleic acid molecule is free of sequences (preferably protein 
encoding sequences) which naturally flank the nucleic acid (i.e., sequences located at the 5' 
and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the 
nucleic acid is derived. In other embodiments, the "isolated" nucleic acid is free of intron 
sequences. For example, in various embodiments, the isolated nucleic acid molecule can 
contain less than about 5 kB, 4 kB, 3 kB, 2 kB, 1 kB, 0.5 kB or 0.1 kB of nucleotide 
sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from 
which the nucleic acid is derived. Moreover, an "isolated" nucleic acid molecule, such as a 
cDNA molecule, can be substantially free of other cellular material, or culture medium 
when produced by recombinant techniques, or substantially free of chemical precursors or 
other chemicals when chemically synthesized. In one embodiment, the nucleic acid 
molecules of the invention comprise a contiguous open reading frame encoding a 
polypeptide of the invention. 

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule 
having the nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 
21, 22, 24, 25, 27, 28 or 30, or a complement thereof, can be isolated using standard 
molecular biology techniques and the sequence information provided herein. Using all or a 
portion of the nucleic acid sequences of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 
19, 21, 22, 24, 25, 27, 28 or 30 as a hybridization probe, nucleic acid molecules of the 
invention can be isolated using standard hybridization and cloning techniques (e.g., as 
described in Sambrook et al., eds., Molecular Cloning: A Laboratory Manual, 2nd 
ed., 1989, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY). 

A nucleic acid molecule of the invention can be amplified using cDNA, mRNA or 
genomic DNA as a template and appropriate oligonucleotide primers according to standard 
PCR amplification techniques. The nucleic acid so amplified can be cloned into an 
appropriate vector and characterized by DNA sequence analysis. Furthermore, 
oligonucleotides corresponding to all or a portion of a nucleic acid molecule of the 
invention can be prepared by standard synthetic techniques, e.g., using an automated DNA 
synthesizer. 

In another preferred embodiment, an isolated nucleic acid molecule of the invention 
comprises a nucleic acid molecule which is a complement of the nucleotide sequence of 
SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a 
portion thereof. A nucleic acid molecule which is complementary to a given nucleotide 
sequence is one which is sufficiently complementary to the given nucleotide sequence that 
it can hybridize to the given nucleotide sequence thereby forming a stable duplex. 
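
By way of non-limiting illustration only, the notion of a complement can be sketched in Python as follows. The sketch assumes strict Watson-Crick pairing over the full length of the sequence, which is a narrower test than the "sufficiently complementary" standard stated above, and the example sequence is hypothetical and is not one of the sequences disclosed herein. 

```python
# Watson-Crick pairing for DNA; the ambiguity code N is left as N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, both read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_perfect_complement(a: str, b: str) -> bool:
    """True when b pairs with every base of a (no mismatches tolerated)."""
    return reverse_complement(a).upper() == b.upper()


if __name__ == "__main__":
    example = "ATGGCCAAG"  # hypothetical sequence, for illustration only
    print(reverse_complement(example))  # CTTGGCCAT
```

A perfect-match test of this kind is the simplest case; a molecule with some mismatches can still hybridize and form a stable duplex, so the function above should be read as a lower bound on complementarity rather than as a definition of it. 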

Moreover, a nucleic acid molecule of the invention can comprise only a portion of a 
nucleic acid sequence encoding a full length polypeptide of the invention, for example, a 
fragment which can be used as a probe or primer or a fragment encoding a biologically 
active portion of a polypeptide of the invention. The nucleotide sequence determined from 
the cloning of one gene allows for the generation of probes and primers designed for use in 
identifying and/or cloning homologues in other cell types, e.g., from other tissues, as well as 
homologues from other mammals. The probe/primer typically comprises substantially 
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide 
sequence that hybridizes under stringent conditions to at least about 12, preferably about 25, 
more preferably about 50, 75, 100, 125, 150, 175, 200, 250, 300, 350 or 400 consecutive 
nucleotides of the sense or anti-sense sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 
15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or of a naturally occurring mutant of SEQ ID 
NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30. 

Probes based on the sequence of a nucleic acid molecule of the invention can be 
used to detect transcripts or genomic sequences encoding the same protein molecule 
encoded by a selected nucleic acid molecule. The probe comprises a label group attached 
thereto, e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. 
Such probes can be used as part of a diagnostic test kit for identifying cells or tissues which 
mis-express the protein, such as by measuring levels of a nucleic acid molecule encoding 
the protein in a sample of cells from a subject, e.g., detecting mRNA levels or determining 
whether a gene encoding the protein has been mutated or deleted. 

A nucleic acid fragment encoding a biologically active portion of a polypeptide of 
the invention can be prepared by isolating a portion of any of SEQ ID NOs:1, 3, 4, 6, 7, 9, 
10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, expressing the encoded portion of the 
polypeptide protein (e.g., by recombinant expression in vitro) and assessing the activity of 
the encoded portion of the polypeptide. 

The invention further encompasses nucleic acid molecules that differ from the 
nucleotide sequence of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 
25, 27, 28 or 30, due to degeneracy of the genetic code and thus encode the same protein as 
that encoded by the nucleotide sequence SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 
19, 21, 22, 24, 25, 27, 28 or 30. 
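
Degeneracy of the genetic code can be illustrated with a short Python sketch. The sketch assumes the standard genetic code; the two coding sequences are hypothetical and are not sequences of the invention. They differ at three nucleotide positions yet translate to the same amino acid sequence. 

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    seq_a = "ATGGCCAAGCTG"  # hypothetical coding sequence
    seq_b = "ATGGCTAAACTC"  # same protein, different third-position bases
    print(translate(seq_a), translate(seq_b))  # MAKL MAKL
```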

In addition to the nucleotide sequences of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 
15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, it will be appreciated by those skilled in the art 
that DNA sequence polymorphisms that lead to changes in the amino acid sequence may 
exist within a population (e.g., the human population). Such genetic polymorphisms may 
exist among individuals within a population due to natural allelic variation. An allele is one 
of a group of genes which occur alternatively at a given genetic locus. As used herein, the 
phrase "allelic variant" refers to a nucleotide sequence which occurs at a given locus or to a 
polypeptide encoded by the nucleotide sequence. As used herein, the terms "gene" and 
"recombinant gene" refer to nucleic acid molecules comprising an open reading frame 
encoding a polypeptide of the invention. Such natural allelic variations can typically result 
in 1-5% variance in the nucleotide sequence of a given gene. Alternative alleles can be 
identified by sequencing the gene of interest in a number of different individuals. This can 
be readily carried out by using hybridization probes to identify the same genetic locus in a 
variety of individuals. Any and all such nucleotide variations and resulting amino acid 
polymorphisms or variations that are the result of natural allelic variation and that do not 
alter the functional activity are intended to be within the scope of the invention. 

Moreover, nucleic acid molecules encoding proteins of the invention from other 
species (homologues), which have a nucleotide sequence which differs from that of the 
human protein described herein are intended to be within the scope of the invention. 
Nucleic acid molecules corresponding to natural allelic variants and homologues of a cDNA 
of the invention can be isolated based on their identity to the human nucleic acid molecule 
disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe 
according to standard hybridization techniques under stringent hybridization conditions. 
For example, a cDNA encoding a soluble form of a membrane-bound protein of the 
invention can be isolated based on its hybridization to a nucleic acid molecule encoding all 
or part of the membrane-bound form. Likewise, a cDNA encoding a membrane-bound form 
can be isolated based on its hybridization to a nucleic acid molecule encoding all or part of 
the soluble form. 

Accordingly, in another embodiment, an isolated nucleic acid molecule of the 
invention is at least 300 (325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 
1000, or 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 
3800, 4000, or 4200) nucleotides in length and hybridizes under stringent conditions to the 
nucleic acid molecule comprising the nucleotide sequence, preferably the coding sequence, 
of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a 
complement thereof. 

As used herein, the term "hybridizes under stringent conditions" is intended to 
describe conditions for hybridization and washing under which nucleotide sequences at 
least 60% (65%, 70%, preferably 75%) identical to each other typically remain hybridized 
to each other. Such stringent conditions are known to those skilled in the art and can be 
found in Current Protocols in Molecular Biology, 1989, John Wiley & Sons, NY, sections 
6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization conditions is 
hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or 
more washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an isolated nucleic acid 
molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ 
ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28 or 30, or a 
complement thereof, corresponds to a naturally-occurring nucleic acid molecule. As used 
herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule 
having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein). 

In addition to naturally-occurring allelic variants of a nucleic acid molecule of the 
invention sequence that may exist in the population, the skilled artisan will further 
appreciate that changes can be introduced by mutation thereby leading to changes in the 
amino acid sequence of the encoded protein, without altering the biological activity of the 
protein. For example, one can make nucleotide substitutions leading to amino acid 
substitutions at "non-essential" amino acid residues. A "non-essential" amino acid residue 
is a residue that can be altered from the wild-type sequence without altering the biological 
activity, whereas an "essential" amino acid residue is required for biological activity. For 
example, amino acid residues that are not conserved or only semi-conserved among 
homologues of various species may be non-essential for activity and thus would be likely 
targets for alteration. Alternatively, amino acid residues that are conserved among the 
homologues of various species (e.g., murine and human) may be essential for activity and 
thus would not be likely targets for alteration. 

Accordingly, another aspect of the invention pertains to nucleic acid molecules 
encoding a polypeptide of the invention that contain changes in amino acid residues that are 
not essential for activity. Such polypeptides differ in amino acid sequence from SEQ ID 
NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, yet retain biological activity. In one embodiment, 
the isolated nucleic acid molecule includes a nucleotide sequence encoding a protein that 
includes an amino acid sequence that is at least about 45% identical, 65%, 75%, 85%, 95%, 
or 98% identical to the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, 
or 29. 

An isolated nucleic acid molecule encoding a variant protein can be created by 
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide 
sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, such that one or more amino 
acid substitutions, additions or deletions are introduced into the encoded protein. Mutations 
can be introduced by standard techniques, such as site-directed mutagenesis and 
PCR-mediated mutagenesis. Briefly, PCR primers are designed that delete the trinucleotide 
codon of the amino acid to be changed and replace it with the trinucleotide codon of the 
amino acid to be included. This primer is used in the PCR amplification of DNA encoding 
the protein of interest. This fragment is then isolated and inserted into the full length cDNA 
encoding the protein of interest and expressed recombinantly. The resulting protein now 
includes the amino acid replacement. 
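
The sequence-level outcome of the codon replacement described above can be sketched in Python. This is a minimal sketch that models only the resulting coding sequence, not the primer design, amplification, or cloning steps. The function name `substitute_codon`, the coding sequence, and the chosen replacement are hypothetical and serve only as an illustration. 

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute_codon(cds: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for a 1-based residue number with new_codon."""
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly three nucleotides")
    start = (residue_number - 1) * 3
    if start < 0 or start + 3 > len(cds):
        raise ValueError("residue number lies outside the coding sequence")
    return cds[:start] + new_codon.upper() + cds[start + 3:]


if __name__ == "__main__":
    wild_type = "ATGGCCAAGCTG"                     # hypothetical, encodes MAKL
    mutant = substitute_codon(wild_type, 3, "CGT")  # Lys (AAG) -> Arg (CGT)
    print(mutant, translate(mutant))               # ATGGCCCGTCTG MARL
```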

Preferably, conservative amino acid substitutions are made at one or more predicted 
non-essential amino acid residues. Conservative replacements are those that take place 
within a family of amino acids that are related in their side chains. Genetically encoded 
amino acids can be divided into four families: (1) acidic = aspartate, glutamate; (2) basic 
= lysine, arginine, histidine; (3) nonpolar = alanine, valine, leucine, isoleucine, proline, 
phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine, asparagine, 
glutamine, cysteine, serine, threonine, tyrosine. In similar fashion, the amino acid repertoire 
can be grouped as (1) acidic = aspartate, glutamate; (2) basic = lysine, arginine, histidine; (3) 
aliphatic = glycine, alanine, valine, leucine, isoleucine, serine, threonine, with serine and 
threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic = 
phenylalanine, tyrosine, tryptophan; (5) amide = asparagine, glutamine; and (6) 
sulfur-containing = cysteine and methionine. (See, for example, Biochemistry, 4th ed., Ed. by L. 
Stryer, WH Freeman and Co.: 1995). 
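
The four-family grouping recited above lends itself to a simple lookup. The following Python sketch encodes that grouping with one-letter amino acid codes and tests whether a proposed substitution stays within a family. The function name `is_conservative` is an assumption made for illustration, and the alternative six-group scheme is not modeled. 

```python
# Four side-chain families as recited in the text, in one-letter codes.
FAMILIES = {
    "acidic": "DE",               # aspartate, glutamate
    "basic": "KRH",               # lysine, arginine, histidine
    "nonpolar": "AVLIPFMW",       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": "GNQCSTY",  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}

# Reverse lookup: amino acid -> family name.
FAMILY_OF = {aa: name for name, members in FAMILIES.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return FAMILY_OF[original.upper()] == FAMILY_OF[replacement.upper()]


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # True: both acidic
    print(is_conservative("K", "D"))  # False: basic versus acidic
```

Whether a given conservative substitution actually preserves activity is an empirical question; a lookup of this kind only flags candidates for testing. 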

Alternatively, mutations can be introduced randomly along all or part of the coding 
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 
biological activity to identify mutants that retain activity. Following mutagenesis, the 
encoded protein can be expressed recombinantly and the activity of the protein can be 
determined. 

In a preferred embodiment, a mutant polypeptide that is a variant of a polypeptide of 
the invention can be assayed for: (1) the ability to form protein-protein interactions with 
proteins in a signaling pathway of the polypeptide of the invention; (2) the ability to bind a 
ligand of the polypeptide of the invention; or (3) the ability to bind to an intracellular target 
protein of the polypeptide of the invention. In yet another preferred embodiment, the 
mutant polypeptide can be assayed for the ability to modulate cellular proliferation, cellular 
migration or chemotaxis, or cellular differentiation. 

The present invention encompasses antisense nucleic acid molecules, i.e., molecules 
which are complementary to a sense nucleic acid encoding a polypeptide of the invention, 
e.g., complementary to the coding strand of a double-stranded cDNA molecule or 
complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can 
hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to 
an entire coding strand, or to only a portion thereof, e.g., all or part of the protein coding 
region (or open reading frame). An antisense nucleic acid molecule can be antisense to all 
or part of a non-coding region of the coding strand of a nucleotide sequence encoding a 
polypeptide of the invention. The non-coding regions ("5' and 3' untranslated regions") are 
the 5' and 3' sequences which flank the coding region and are not translated into amino 
acids. 



An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 
45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be 
constructed using chemical synthesis and enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) 
can be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of 
modified nucleotides which can be used to generate the antisense nucleic acid include 
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, 
dihydrouracil, β-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, β-D-mannosylqueosine, 
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, 
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid 
methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a selected polypeptide of the invention to thereby inhibit 
expression, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the 
case of an antisense nucleic acid molecule which binds to DNA duplexes, through specific 
interactions in the major groove of the double helix. An example of a route of 
administration of antisense nucleic acid molecules of the invention includes direct injection 
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target 
selected cells and then administered systemically. For example, for systemic 
administration, antisense molecules can be modified such that they specifically bind to 
receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense 
nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or 
antigens. The antisense nucleic acid molecules can also be delivered to cells using the 
vectors described herein. To achieve sufficient intracellular concentrations of the antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under 
the control of a strong pol II or pol III promoter are preferred. 

An antisense nucleic acid molecule of the invention can be an α-anomeric nucleic 
acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded 
hybrids with complementary RNA in which, contrary to the usual β-units, the strands run 
parallel to each other (Gaultier et al., 1987, Nucleic Acids Res. 15:6625-41). The antisense 
nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al., 1987, 
Nucleic Acids Res. 15:6131-48) or a chimeric RNA-DNA analogue (Inoue et al., 1987, 
FEBS Lett. 215:327-30). 

The invention also encompasses ribozymes. Ribozymes are catalytic RNA 
molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic 
acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes 
(e.g., hammerhead ribozymes; described in Haselhoff and Gerlach, 1988, Nature 334:585- 
91) can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of the 
protein encoded by the mRNA. A ribozyme having specificity for a nucleic acid molecule 
encoding a polypeptide of the invention can be designed based upon the nucleotide 
sequence of a cDNA disclosed herein. For example, a derivative of a Tetrahymena L-19 
IVS RNA can be constructed in which the nucleotide sequence of the active site is 
complementary to the nucleotide sequence to be cleaved, as described in Cech et al., U.S. 
Patent No. 4,987,071; and Cech et al., U.S. Patent No. 5,116,742. Alternatively, an mRNA 
encoding a polypeptide of the invention can be used to select a catalytic RNA having a 
specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel and Szostak, 
1993, Science 261:1411-8. 

The invention also encompasses nucleic acid molecules which form triple helical 
structures. For example, expression of a polypeptide of the invention can be inhibited by 
targeting nucleotide sequences complementary to the regulatory region of the gene 
encoding the polypeptide (e.g., the promoter and/or enhancer) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally Helene, 1991, 
Anticancer Drug Des. 6(6):569-84; Helene, 1992, Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher, 1992, Bioassays 14(12):807-15. 

In various embodiments, the nucleic acid molecules of the invention can be 
modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the 
stability, hybridization, or solubility of the molecule. For example, the deoxyribose 
phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids 
(see Hyrup et al., 1996, Bioorganic & Medicinal Chemistry 4(1): 5-23). As used herein, the 
terms "peptide nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in 
which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and 
only the four natural nucleobases are retained. The neutral backbone of PNAs has been 
shown to allow for specific hybridization to DNA and RNA under conditions of low ionic 
strength. The synthesis of PNA oligomers can be performed using standard solid phase 
peptide synthesis protocols as described in Hyrup et al., 1996, supra; Perry-O'Keefe et al., 
1996, Proc. Natl. Acad. Sci. USA 93:14670-5. 

PNAs can be used in therapeutic and diagnostic applications. For example, PNAs 
can be used as antisense or antigene agents for sequence-specific modulation of gene 
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., 
PNA directed PCR clamping; as artificial restriction enzymes when used in combination 
with other enzymes, e.g., S1 nucleases (Hyrup, 1996, supra); or as probes or primers for 
DNA sequence and hybridization (Hyrup, 1996, supra; Perry-O'Keefe et al., 1996, Proc. 
Natl. Acad. Sci. USA 93:14670-675). 

In another embodiment, PNAs can be modified, e.g., to enhance their stability or 
cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of 
PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known 
in the art. For example, PNA-DNA chimeras can be generated which may combine the 
advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the 
PNA portion would provide high binding affinity and specificity. PNA-DNA chimeras can 
be linked using linkers of appropriate lengths selected in terms of base stacking, number of 
bonds between the nucleobases, and orientation (Hyrup, 1996, supra). The synthesis of 
PNA-DNA chimeras can be performed as described in Hyrup (1996, supra) and Finn et al. 
(1996, Nucleic Acids Res. 24(17):3357-63). For example, a DNA chain can be synthesized 
on a solid support using standard phosphoramidite coupling chemistry and modified 
nucleoside analogs. Compounds such as 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine 
phosphoramidite can be used as a link between the PNA and the 5' end of DNA (Mag et al., 
1989, Nucleic Acids Res. 17:5973-88). PNA monomers are then coupled in a stepwise 
manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment 
(Finn et al., 1996, Nucleic Acids Res. 24(17):3357-63). Alternatively, chimeric molecules 
can be synthesized with a 5' DNA segment and a 3' PNA segment (Peterser et al., 1975, 
Bioorganic Med. Chem. Lett. 5:1119-124). 

In other embodiments, the oligonucleotide may include other appended groups such 
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport 
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA 
86:6553-6; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA 84:648-52; PCT Publication 
No. WO 88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). 
In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents 
(see, e.g., Krol et al., 1988, Bio/Techniques 6:958-76) or intercalating agents (see, e.g., Zon, 
1988, Pharm. Res. 5:539-49). To this end, the oligonucleotide may be conjugated to 
another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport 
agent, hybridization-triggered cleavage agent, etc. 

II. Isolated Proteins and Antibodies 

One aspect of the invention pertains to isolated proteins, and biologically active 
portions thereof, as well as polypeptide fragments suitable for use as immunogens to raise 
antibodies directed against a polypeptide of the invention. In one embodiment, the native 
polypeptide can be isolated from cells or tissue sources by an appropriate purification 
scheme using standard protein purification techniques. In another embodiment, 
polypeptides of the invention are produced by recombinant DNA techniques. Alternative to 
recombinant expression, a polypeptide of the invention can be synthesized chemically using 
standard peptide synthesis techniques. 

An "isolated" or "purified" protein or biologically active portion thereof is 
substantially free of cellular material or other contaminating proteins from the cell or tissue 
source from which the protein is derived, or substantially free of chemical precursors or 
other chemicals when chemically synthesized. The language "substantially free of cellular 
material" includes preparations of protein in which the protein is separated from cellular 
components of the cells from which it is isolated or recombinantly produced. Thus, protein 
that is substantially free of cellular material includes preparations of protein having less 
than about 30%, 20%, 10%, or 5% (by dry weight) of heterologous protein (also referred to 
herein as a "contaminating protein"). When the protein or biologically active portion 
thereof is recombinantly produced, it is also preferably substantially free of culture medium, 
i.e., culture medium represents less than about 20%, 10%, or 5% of the volume of the 
protein preparation. When the protein is produced by chemical synthesis, it is preferably 
substantially free of chemical precursors or other chemicals, i.e., it is separated from 
chemical precursors or other chemicals which are involved in the synthesis of the protein. 
Accordingly such preparations of the protein have less than about 30%, 20%, 10%, 5% (by 
dry weight) of chemical precursors or compounds other than the polypeptide of interest. 
The term "pure" or "isolated" as used herein preferably has the same numerical limits as 
"purified" or "isolated" immediately above. "Isolated" and "purified" do not encompass 
either natural materials in their native state or natural materials that have been separated into 
components (e.g., in an acrylamide gel) but not obtained either as pure (e.g., lacking 
contaminating proteins, or chromatography reagents such as denaturing agents and 
polymers, e.g., acrylamide or agarose) substances or solutions. In preferred embodiments, 
purified or isolated preparations will lack any contaminating proteins from the same animal 
from which the protein is normally produced, as can be accomplished by recombinant 
expression of, for example, a human protein in a non-human cell. 

Biologically active portions of a polypeptide of the invention include polypeptides 
comprising amino acid sequences sufficiently identical to or derived from the amino acid 
sequence of the protein (e.g., the amino acid sequence shown in any of SEQ ID NOs:2, 5, 8, 
11, 14, or 17), which include fewer amino acids than the full length protein, and exhibit at 
least one activity of the corresponding full-length protein. Typically, biologically active 
portions comprise a domain or motif with at least one activity of the corresponding protein. 
A biologically active portion of a protein of the invention can be a polypeptide which is, for 
example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically 
active portions, in which other regions of the protein are deleted, can be prepared by 
recombinant techniques and evaluated for one or more of the functional activities of the 
native form of a polypeptide of the invention. 

Preferred polypeptides have the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 
17, 20, 23, 26, or 29. Other useful proteins are substantially identical (e.g., at least about 
45%, preferably 55%, 65%, 75%, 85%, 95%, or 99%) to any of SEQ ID NOs:2, 5, 8, 11, 14, 
17, 20, 23, 26, or 29 and retain the functional activity of the corresponding 
naturally-occurring protein yet differ in amino acid sequence due to natural allelic variation 
or mutagenesis. 

To determine the percent identity of two amino acid sequences or of two nucleic 
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be 
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal 
alignment with a second amino or nucleic acid sequence). The amino acid residues or 
nucleotides at corresponding amino acid positions or nucleotide positions are then 
compared. When a position in the first sequence is occupied by the same amino acid 
residue or nucleotide as the corresponding position in the second sequence, then the 
molecules are identical at that position. The percent identity between the two sequences is a 
function of the number of identical positions shared by the sequences (i.e., % identity = # of 
identical positions/total # of positions (e.g., overlapping positions) x 100). In one 
embodiment the two sequences are the same length. 

The determination of percent identity between two sequences can be accomplished 
using a mathematical algorithm. A preferred, non-limiting example of a mathematical 
algorithm utilized for the comparison of two sequences is the algorithm of Karlin and 
Altschul (1990, Proc. Natl. Acad. Sci. USA 87:2264-8), modified as in Karlin and Altschul 
(1993, Proc. Natl. Acad. Sci. USA 90:5873-7). Such an algorithm is incorporated into the 
NBLAST and XBLAST programs of Altschul et al. (1990, J. Mol. Biol. 215:403-10). 
BLAST nucleotide searches can be performed with the NBLAST program, score = 100, 
wordlength = 12 to obtain nucleotide sequences homologous to a nucleic acid molecule of 
the invention. BLAST protein searches can be performed with the XBLAST program, 
score = 50, wordlength = 3 to obtain amino acid sequences homologous to a protein 
molecule of the invention. To obtain gapped alignments for comparison purposes, Gapped 
BLAST can be utilized as described in Altschul et al. (1997, Nucleic Acids Res. 25:3389- 
402). Alternatively, PSI-Blast can be used to perform an iterated search which detects 
distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, 
and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST 
and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, non- 
limiting example of a mathematical algorithm utilized for the comparison of sequences is 
the algorithm of Myers and Miller (1988, CABIOS 4:11-17). Such an algorithm is 
incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence 
alignment software package. When utilizing the ALIGN program for comparing amino 
acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap 
penalty of 4 can be used. 

The percent identity between two sequences can be determined using techniques 
similar to those described above, with or without allowing gaps. In calculating percent 
identity, typically exact matches are counted. 
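
The percent identity formula given above can be illustrated with a short sketch. The 
Python function below is illustrative only: it assumes the two sequences have already been 
aligned (for example, by one of the programs named above) with gaps written as "-", and it 
treats only columns in which both sequences carry a residue as "overlapping positions." 
The function name, the gap character, and the example peptides are hypothetical and are 
not taken from any of the cited programs. 

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return % identity = identical positions / overlapping positions x 100.

    Both inputs must be pre-aligned strings of equal length. Columns in
    which either sequence has a gap are not counted as overlapping
    positions (an assumption made for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = 0
    overlapping = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # gap column: not an overlapping position
        overlapping += 1
        if res_a == res_b:
            identical += 1  # only exact matches are counted

    if overlapping == 0:
        return 0.0
    return 100.0 * identical / overlapping


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, for illustration only.
    print(percent_identity("MKT-LLVA", "MKTALIVA"))
```

Counting gap columns in the denominator instead would lower the reported value, so the 
convention used should be stated alongside any percent identity figure. 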

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises all or part (preferably biologically active) of a 
polypeptide of the invention operably linked to a heterologous polypeptide (i.e., a 
polypeptide other than the same polypeptide of the invention). Within the fusion protein, 
the term "operably linked" is intended to indicate that the polypeptide of the invention and 
the heterologous polypeptide are fused in-frame to each other. The heterologous 
polypeptide can be fused to the N-terminus or C-terminus of the polypeptide of the 
invention. 

One useful fusion protein is a GST fusion protein in which the polypeptide of the 
invention is fused to the C-terminus of GST sequences. Such fusion proteins can facilitate 
the purification of a recombinant polypeptide of the invention. 

In another embodiment, the fusion protein contains a heterologous signal peptide at 
its N-terminus. For example, the native signal peptide of a polypeptide of the invention can 
be removed and replaced with a signal peptide from another protein. For example, the gp67 
secretory sequence of the baculovirus envelope protein can be used as a heterologous signal 
peptide (Current Protocols in Molecular Biology, 1992, Ausubel et al., eds., John Wiley & 
Sons). Other examples of eukaryotic heterologous signal peptides include the secretory 
sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, 
California). In yet another example, useful prokaryotic heterologous signal peptides include 
the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal 
(Pharmacia Biotech; Piscataway, New Jersey). 

In yet another embodiment, the fusion protein is an immunoglobulin fusion protein 
in which all or part of a polypeptide of the invention is fused to sequences derived from a 
member of the immunoglobulin protein family. The immunoglobulin fusion proteins of the 
invention can be incorporated into pharmaceutical compositions and administered to a 
subject to inhibit an interaction between a ligand (soluble or membrane-bound) and a 
protein on the surface of a cell (receptor), to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion protein can be used to affect the bioavailability of a cognate 
ligand of a polypeptide of the invention. Inhibition of ligand/receptor interaction may be 
useful therapeutically, both for treating proliferative and differentiative disorders and for 
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin 
fusion proteins of the invention can be used as immunogens to produce antibodies directed 
against a polypeptide of the invention in a subject, to purify ligands and in screening assays 
to identify molecules which inhibit the interaction of receptors with ligands. 

Chimeric and fusion proteins of the invention can be produced by standard 
recombinant DNA techniques. In another embodiment, the fusion gene can be synthesized 
by conventional techniques including automated DNA synthesizers. Alternatively, PCR 
amplification of gene fragments can be carried out using anchor primers which give rise to 
complementary overhangs between two consecutive gene fragments which can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, e.g., 
Ausubel et al., supra). Moreover, many expression vectors are commercially available that 
already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a 
polypeptide of the invention can be cloned into such an expression vector such that the 
fusion moiety is linked in-frame to the polypeptide of the invention. 
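
Whether an upstream fusion moiety leaves the downstream polypeptide in-frame can be 
checked with simple arithmetic on the coding sequence. The sketch below is a hypothetical 
illustration rather than a cloning protocol: the function name, the optional linker argument, 
and the decision to treat any stop codon in the upstream segment as a failure are all 
assumptions made for the example. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def keeps_reading_frame(upstream_cds: str, linker: str = "") -> bool:
    """Check that a downstream coding sequence would be read in-frame.

    upstream_cds -- coding sequence of the fusion moiety, starting at its
                    first codon (hypothetical input for this sketch)
    linker       -- any nucleotides between the moiety and the insert
    """
    joined = (upstream_cds + linker).upper()

    # The junction preserves the frame only if everything upstream of the
    # insert is a whole number of codons.
    if len(joined) % 3 != 0:
        return False

    # A stop codon upstream would end translation before the insert.
    codons = [joined[i:i + 3] for i in range(0, len(joined), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(keeps_reading_frame("ATGGCC", "GGATCC"))
    print(keeps_reading_frame("ATGGCC", "GG"))
```

A check of this kind covers only the reading frame; it says nothing about whether the 
resulting fusion protein folds or retains activity. 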

A signal peptide of a polypeptide of the invention (SEQ ID NOs:101, 110, 112, 125, 
127, or 132) can be used to facilitate secretion and isolation of the secreted protein or other 
proteins of interest. Signal peptides are typically characterized by a core of hydrophobic 
amino acids which are generally cleaved from the mature protein during secretion in one or 
more cleavage events. Such signal peptides contain processing sites that allow cleavage of 
the signal peptide from the mature proteins as they pass through the secretory pathway. 
Thus, the invention pertains to the described polypeptides having a signal peptide, as well 
as to the signal peptide itself and to the polypeptide in the absence of the signal peptide (i.e., 
the cleavage products). In one embodiment, a nucleic acid sequence encoding a signal 
peptide of the invention can be operably linked in an expression vector to a protein of 
interest, such as a protein which is ordinarily not secreted or is otherwise difficult to isolate. 
The signal peptide directs secretion of the protein, such as from a eukaryotic host into which 
the expression vector is transformed, and the signal peptide is subsequently or concurrently 
cleaved. The protein can then be readily purified from the extracellular medium by art 
recognized methods. Alternatively, the signal peptide can be linked to the protein of 
interest using a sequence which facilitates purification, such as with a GST domain. 

In another embodiment, the signal peptides of the present invention can be used to 
identify regulatory sequences, e.g., promoters, enhancers, repressors. Since signal peptides 
are the most amino-terminal sequences of a peptide, it is expected that the nucleic acids 
which flank the signal peptide on its amino-terminal side will be regulatory sequences 
which affect transcription. Thus, a nucleotide sequence which encodes all or a portion of a 
signal peptide can be used as a probe to identify and isolate signal peptides and their 
flanking regions, and these flanking regions can be studied to identify regulatory elements 
therein. 

The present invention also pertains to variants of the polypeptides of the invention. 
Such variants have an altered amino acid sequence which can function as either agonists 
(mimetics) or as antagonists. Variants can be generated by mutagenesis, e.g., discrete point 
mutation or truncation. An agonist can retain substantially the same, or a subset, of the 
biological activities of the naturally occurring form of the protein. An antagonist of a 
protein can inhibit one or more of the activities of the naturally occurring form of the 
protein by, for example, competitively binding to a downstream or upstream member of a 
cellular signaling cascade which includes the protein of interest. Thus, specific biological 
effects can be elicited by treatment with a variant of limited function. Treatment of a 
subject with a variant having a subset of the biological activities of the naturally occurring 
form of the protein can have fewer side effects in a subject relative to treatment with the 
naturally occurring form of the protein. 

Modification of the structure of the subject polypeptides can be for such purposes as 
enhancing therapeutic or prophylactic efficacy, stability (e.g., ex vivo shelf life and 
resistance to proteolytic degradation in vivo), or post-translational modifications (e.g., to 
alter phosphorylation pattern of protein). Such modified peptides, when designed to retain 
at least one activity of the naturally-occurring form of the protein, or to produce specific 
antagonists thereof, are considered functional equivalents of the polypeptides described in 
more detail herein. Such modified peptides can be produced, for instance, by amino acid 
substitution, deletion, or addition. 

For example, it is reasonable to expect that an isolated replacement of a leucine with 
an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar 
replacement of an amino acid with a structurally related amino acid (i.e., isosteric and/or 
isoelectric mutations) will not have a major effect on the biological activity of the resulting 
molecule. 

Whether a change in the amino acid sequence of a peptide results in a functional 
homolog (e.g., functional in the sense that the resulting polypeptide mimics or antagonizes 
the wild-type form) can be readily determined by assessing the ability of the variant peptide 
to produce a response in cells in a fashion similar to the wild-type protein, or competitively 
inhibit such a response. Polypeptides in which more than one replacement has taken place 
can readily be tested in the same manner. 

Variants of a protein of the invention which function as either agonists (mimetics) or 
as antagonists can be identified by screening combinatorial libraries of mutants, e.g., 
truncation mutants, of the protein of the invention for agonist or antagonist activity. In one 
embodiment, a variegated library of variants is generated by combinatorial mutagenesis at 
the nucleic acid level and is encoded by a variegated gene library. A variegated library of 
variants can be produced by, for example, enzymatically ligating a mixture of synthetic 
oligonucleotides into gene sequences such that a degenerate set of potential protein 
sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion 
proteins (e.g., for phage display). There are a variety of methods which can be used to 
produce libraries of potential variants of the polypeptides of the invention from a degenerate 
oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known 
in the art (see, e.g., Narang, 1983, Tetrahedron 39:3; Itakura et al., 1984, Annu. Rev. 
Biochem. 53:323; Itakura et al., 1984, Science 198:1056; Ike et al., 1983, Nucleic Acid 
Res. 11:477). 
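
The notion of a degenerate oligonucleotide encoding a set of sequences can be made 
concrete with a short sketch. The Python code below expands a degenerate 
oligonucleotide, written with the standard IUPAC nucleotide ambiguity codes, into every 
specific sequence it represents. It is an illustration only: the function name and the 
example oligonucleotides are hypothetical, and the sketch enumerates sequences on paper 
rather than describing any synthesis method cited above. 

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo: str) -> List[str]:
    """List every specific sequence encoded by a degenerate oligonucleotide."""
    try:
        pools = [IUPAC[base] for base in oligo.upper()]
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]}") from None
    # Cartesian product over the allowed bases at each position.
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    # Hypothetical degenerate codon, for illustration only.
    variants = expand_degenerate("NNK")
    print(len(variants), variants[:4])
```

Because library size grows multiplicatively with each degenerate position, enumeration of 
this kind is practical only for short oligonucleotides; for longer ones the size can be 
computed as the product of the pool sizes without listing the members. 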

In addition, libraries of fragments of the coding sequence of a polypeptide of the 
invention can be used to generate a variegated population of polypeptides, for screening and 
subsequent selection of variants. For example, a library of coding sequence fragments can 
be generated by treating a double stranded PCR fragment of the coding sequence of interest 
with a nuclease under conditions wherein nicking occurs only about once per molecule, 
denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA 
which can include sense/antisense pairs from different nicked products, removing single 
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the 
resulting fragment library into an expression vector. By this method, an expression library 
can be derived which encodes N-terminal and internal fragments of various sizes of the 
protein of interest. 

Several techniques are known in the art for screening gene products of combinatorial 
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. The most widely used techniques, which are amenable 
to high through-put analysis, for screening large gene libraries typically include cloning the 
gene library into replicable expression vectors, transforming appropriate cells with the 
resulting library of vectors, and expressing the combinatorial genes under conditions in 
which detection of a desired activity facilitates isolation of the vector encoding the gene 
whose product was detected. Recursive ensemble mutagenesis (REM), a technique which 
enhances the frequency of functional mutants in the libraries, can be used in combination 
with the screening assays to identify variants of a protein of the invention (Arkin and 
Youvan, 1992, Proc. Natl. Acad. Sci. USA 89:7811-5; Delagrave et al., 1993, Protein 
Engineering 6(3):327-31). 

An isolated polypeptide of the invention, or a fragment thereof, can be used as an 
immunogen to generate antibodies using standard techniques for polyclonal and monoclonal 
antibody preparation. The full-length polypeptide or protein can be used or, alternatively, 
the invention provides antigenic peptide fragments for use as immunogens. The antigenic 
peptide of a protein of the invention comprises at least 8 (preferably 10, 15, 20, or 30) 
amino acid residues of the amino acid sequence of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 
26, or 29, and encompasses an epitope of the protein such that an antibody raised against the 
peptide forms a specific immune complex with the protein. 

Preferred epitopes encompassed by the antigenic peptide are regions that are located 
on the surface of the protein, e.g., hydrophilic regions. Hydropathy plots or similar analyses 
can be used to identify hydrophilic regions. 
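
A hydropathy analysis of the kind mentioned above can be sketched as a sliding-window 
average over per-residue hydropathy values. The Python code below is a minimal 
illustration under stated assumptions: it uses the Kyte-Doolittle hydropathy scale, a 
window of 7 residues, and a cutoff of 0.0 for calling a window hydrophilic. None of these 
choices is specified in the text, and the function names and test peptide are hypothetical. 

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> List[float]:
    """Mean hydropathy of each window of `window` consecutive residues."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(
    seq: str, window: int = 7, threshold: float = 0.0
) -> List[Tuple[int, float]]:
    """Return (start index, mean score) for windows scoring below threshold."""
    profile = hydropathy_profile(seq, window)
    return [(i, score) for i, score in enumerate(profile) if score < threshold]


if __name__ == "__main__":
    # Hypothetical peptide, for illustration only.
    for start, score in hydrophilic_windows("KKKKIIII", window=4):
        print(start, round(score, 2))
```

Windows flagged in this way are only candidates: a hydrophilic stretch is more likely to be 
surface-exposed, but exposure and antigenicity would still need to be confirmed 
experimentally. 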

An immunogen typically is used to prepare antibodies by immunizing a suitable 
subject (e.g., rabbit, goat, mouse or other mammal). An appropriate immunogenic 
preparation can contain, for example, recombinantly expressed or chemically synthesized 
polypeptide. The preparation can further include an adjuvant, such as Freund's complete or 
incomplete adjuvant, or similar immunostimulatory agent. 

Accordingly, another aspect of the invention pertains to antibodies directed against 
a polypeptide of the invention. The term "antibody" as used herein refers to 
immunoglobulin molecules and immunologically active portions of immunoglobulin 
molecules, i.e., molecules that contain an antigen binding site which specifically binds an 
antigen, such as a polypeptide of the invention, e.g., an epitope of a polypeptide of the 
invention. A molecule which specifically binds to a given polypeptide of the invention is a 
molecule which binds the polypeptide, but does not substantially bind other molecules in a 
sample, e.g., a biological sample, which naturally contains the polypeptide. Examples of 
immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2 
fragments which can be generated by treating the antibody with an enzyme such as pepsin. 
The invention provides polyclonal and monoclonal antibodies. The term "monoclonal 
antibody" or "monoclonal antibody composition", as used herein, refers to a population of 
antibody molecules that contain only one species of an antigen binding site capable of 
immunoreacting with a particular epitope. 

Polyclonal antibodies can be prepared as described above by immunizing a suitable 
subject with a polypeptide of the invention as an immunogen. Preferred polyclonal 
antibody compositions are ones that have been selected for antibodies directed against a 
polypeptide or polypeptides of the invention. Particularly preferred polyclonal antibody 
preparations are ones that contain only antibodies directed against a polypeptide or 
polypeptides of the invention. Particularly preferred immunogen compositions are those 
that contain no other human proteins such as, for example, immunogen compositions made 
using a non-human host cell for recombinant expression of a polypeptide of the invention. 
In such a manner, the only human epitope or epitopes recognized by the resulting antibody 
compositions raised against this immunogen will be present as part of a polypeptide or 
polypeptides of the invention. 

The antibody titer in the immunized subject can be monitored over time by standard 
techniques, such as with an enzyme linked immunosorbent assay (ELISA) using 
immobilized polypeptide. If desired, the antibody molecules can be isolated from the 
mammal (e.g., from the blood) and further purified by well-known techniques, such as 
protein A chromatography to obtain the IgG fraction. Alternatively, antibodies specific for 
a protein or polypeptide of the invention can be selected for (e.g., partially purified) or 
purified by, e.g., affinity chromatography. For example, a recombinantly expressed and 
purified (or partially purified) protein of the invention is produced as described herein, and 
covalently or non-covalently coupled to a solid support such as, for example, a 
chromatography column. The column can then be used to affinity purify antibodies specific 
for the proteins of the invention from a sample containing antibodies directed against a large 
number of different epitopes, thereby generating a substantially purified antibody 
composition, i.e., one that is substantially free of contaminating antibodies. By a 
substantially purified antibody composition is meant, in this context, that the antibody 
sample contains at most only 30% (by dry weight) of contaminating antibodies directed 
against epitopes other than those on the desired protein or polypeptide of the invention, and 
preferably at most 20%, yet more preferably at most 10%, and most preferably at most 5% 
(by dry weight) of the sample is contaminating antibodies. A purified antibody composition 
means that at least 99% of the antibodies in the composition are directed against the desired 
protein or polypeptide of the invention. 

At an appropriate time after immunization, e.g., when the specific antibody titers are 
highest, antibody-producing cells can be obtained from the subject and used to prepare 
monoclonal antibodies by standard techniques, such as the hybridoma technique (Kohler 
and Milstein, 1975, Nature 256:495-7), the human B cell hybridoma technique (Kozbor et 
al., 1983, Immunol. Today 4:72), the EBV-hybridoma technique (Cole et al., 1985, 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pgs. 77-96) or trioma 
techniques. The technology for producing hybridomas is well known (see generally 
Current Protocols in Immunology, 1994, Coligan et al., eds., John Wiley & Sons, Inc., New 
York, NY). Hybridoma cells producing a monoclonal antibody of the invention are 
detected by screening the hybridoma culture supernatants for antibodies that bind the 
polypeptide of interest, e.g., using a standard ELISA assay. 

Alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal 
antibody directed against a polypeptide of the invention can be identified and isolated by 
screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage 
display library) with the polypeptide of interest. Kits for generating and screening phage 
display libraries are commercially available (e.g., the Pharmacia Recombinant Phage 
Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, 
Catalog No. 240612). Additionally, examples of methods and reagents particularly 
amenable for use in generating and screening an antibody display library can be found in, for 
example, U.S. Patent No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication 
No. WO 91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO 
92/15679; PCT Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT 
Publication No. WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al., 1991, 
Bio/Technology 9:1370-2; Hay et al., 1992, Hum. Antibod. Hybridomas 3:81-5; Huse et al., 
1989, Science 246:1275-81; Griffiths et al., 1993, EMBO J. 12:725-34. 

Additionally, recombinant antibodies, such as chimeric and humanized monoclonal 
antibodies, comprising both human and non-human portions, which can be made using 
standard recombinant DNA techniques, are within the scope of the invention. A chimeric 
antibody is a molecule in which different portions are derived from different animal species, 
such as those having a variable region derived from a murine mAb and a human 
immunoglobulin constant region. (See, e.g., Cabilly et al., U.S. Patent No. 4,816,567; and 
Boss et al., U.S. Patent No. 4,816,397, which are incorporated herein by reference in their 
entirety.) Humanized antibodies are antibody molecules from non-human species having 
one or more complementarity determining regions (CDRs) from the non-human species and 
a framework region from a human immunoglobulin molecule. (See, e.g., Queen, U.S. 
Patent No. 5,585,089, which is incorporated herein by reference in its entirety.) Such 
chimeric and humanized monoclonal antibodies can be produced by recombinant DNA 
techniques known in the art, for example using methods described in PCT Publication No. 
WO 87/02671; European Patent Application 184,187; European Patent Application 
171,496; European Patent Application 173,494; PCT Publication No. WO 86/01533; U.S. 
Patent No. 4,816,567; European Patent Application 125,023; Better et al., 1988, Science 
240:1041-3; Liu et al., 1987, Proc. Natl. Acad. Sci. USA 84:3439-43; Liu et al., 1987, J. 
Immunol. 139:3521-6; Sun et al., 1987, Proc. Natl. Acad. Sci. USA 84:214-8; Nishimura et 
al., 1987, Canc. Res. 47:999-1005; Wood et al., 1985, Nature 314:446-9; and Shaw et al., 
1988, J. Natl. Cancer Inst. 80:1553-9; Morrison, 1985, Science 229:1202-7; Oi et al., 1986, 
Bio/Techniques 4:214; U.S. Patent 5,225,539; Jones et al., 1986, Nature 321:522-5; 
Verhoeyen et al., 1988, Science 239:1534; and Beidler et al., 1988, J. Immunol. 141:4053- 
60. 

Completely human antibodies are particularly desirable for therapeutic treatment of 
human patients. Such antibodies can be produced, for example, using transgenic mice 
which are incapable of expressing endogenous immunoglobulin heavy and light chain 
genes, but which can express human heavy and light chain genes. The transgenic mice are 
immunized in the normal fashion with a selected antigen, e.g., all or a portion of a 
polypeptide of the invention. Monoclonal antibodies directed against the antigen can be 
obtained using conventional hybridoma technology. The human immunoglobulin 
transgenes harbored by the transgenic mice rearrange during B cell differentiation, and 
subsequently undergo class switching and somatic mutation. Thus, using such a technique, 
it is possible to produce therapeutically useful IgG, IgA and IgE antibodies. For an 
overview of this technology for producing human antibodies, see Lonberg and Huszar 
(1995, Int. Rev. Immunol. 13:65-93). For a detailed discussion of this technology for 
producing human antibodies and human monoclonal antibodies and protocols for producing 
such antibodies, see, e.g., U.S. Patent 5,625,126; U.S. Patent 5,633,425; U.S. Patent 
5,569,825; U.S. Patent 5,661,016; and U.S. Patent 5,545,806. In addition, companies such 
as Abgenix, Inc. (Fremont, CA), can be engaged to provide human antibodies directed 
against a selected antigen using technology similar to that described above. 

Completely human antibodies which recognize a selected epitope can be generated 
using a technique referred to as "guided selection." In this approach a selected non-human 
monoclonal antibody, e.g., a murine antibody, is used to guide the selection of a completely 
human antibody recognizing the same epitope (Jespers et al., 1994, Bio/technology 
12:899-903). 

Further, an antibody (or fragment thereof) may be conjugated to a therapeutic 
moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or 
cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, 
cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, 
teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy 
anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, 
glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or 
homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., 
methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), 
alkylating agents (e.g., mechlorethamine, thioepa, chlorambucil, melphalan, carmustine 
(BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, 
streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), 
anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics 
(e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin 
(AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine). The conjugates of the 
invention can be used for modifying a given biological response; the drug moiety is not to 
be construed as limited to classical chemical therapeutic agents. For example, the drug 
moiety may be a protein or polypeptide possessing a desired biological activity. Such 
proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, 
or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve 
growth factor, platelet derived growth factor, tissue plasminogen activator; or biological 
response modifiers such as, for example, lymphokines, interleukin-1 ("IL-1"), interleukin-2 
("IL-2"), interleukin-6 ("IL-6"), granulocyte macrophage colony stimulating factor ("GM- 
CSF"), granulocyte colony stimulating factor ("G-CSF"), or other growth factors. 

Techniques for conjugating such therapeutic moiety to antibodies are well known, 
see, e.g., Arnon et al., "Monoclonal Antibodies for Immunotargeting of Drugs in Cancer 
Therapy," in Monoclonal Antibodies and Cancer Therapy, 1985, Reisfeld et al., eds., pgs. 
243-56; Hellstrom et al., "Antibodies For Drug Delivery," in Controlled Drug Delivery, 2nd 
ed., 1987, Robinson et al., eds.; Thorpe, "Antibody Carriers of Cytotoxic Agents in Cancer 
Therapy: A Review," in Monoclonal Antibodies '84 Biological and Clinical Applications, 
1985, Pinchera et al., eds., pgs. 475-506; "Analysis, Results, and Future Prospective of the 
Therapeutic Use of Radiolabeled Antibody in Cancer Therapy," in Monoclonal Antibodies 
for Cancer Detection and Therapy, 1985, Baldwin et al., eds., pgs. 303-16; and Thorpe et 
al., 1982, Immunol. Rev., 62:119-58. Alternatively, an antibody can be conjugated to a 
second antibody to form an antibody heteroconjugate as described by Segal in U.S. Patent 
No. 4,676,980. 

An antibody directed against a polypeptide of the invention (e.g., monoclonal 
antibody) can be used to isolate the polypeptide by standard techniques, such as affinity 
chromatography or immunoprecipitation. Moreover, such an antibody can be used to detect 
the protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance 
and pattern of expression of the polypeptide. The antibodies can also be used diagnostically 
to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to 
determine the efficacy of a given treatment regimen. Detection can be facilitated 
by coupling the antibody to a detectable substance. Examples of detectable substances 
include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, 
bioluminescent materials, and radioactive materials. Examples of suitable enzymes include 
horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; 
examples of suitable prosthetic group complexes include streptavidin/biotin and 
avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, 
fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride 
or phycoerythrin; an example of a luminescent material includes luminol; examples of 
bioluminescent materials include luciferase, luciferin, and aequorin; and examples of 
suitable radioactive material include 125I, 131I, 35S or 3H. 

Further, an antibody (or fragment thereof) can be conjugated to a therapeutic moiety 
such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic 
agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin 
B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, 
vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, 
mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, 
lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic 
agents include, but are not limited to, antimetabolites (e.g., methotrexate, 
6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents 
(e.g., mechlorethamine, thioepa, chlorambucil, melphalan, carmustine (BCNU) and 
lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, 
mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., 
daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin 
(formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and 
anti-mitotic agents (e.g., vincristine and vinblastine). 

The conjugates of the invention can be used for modifying a given biological
response; the drug moiety is not to be construed as limited to classical chemical therapeutic
agents. For example, the drug moiety may be a protein or polypeptide possessing a desired
biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A,
pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor,
α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue
plasminogen activator; or biological response modifiers such as, for example, lymphokines,
interleukin-1 ("IL-1"), interleukin-2 ("IL-2"), interleukin-6 ("IL-6"), granulocyte
macrophage colony stimulating factor ("GM-CSF"), granulocyte colony stimulating factor
("G-CSF"), or other growth factors.

Techniques for conjugating such therapeutic moiety to antibodies are well known,
see, e.g., Arnon et al., "Monoclonal Antibodies For Immunotargeting Of Drugs In Cancer
Therapy", in Monoclonal Antibodies And Cancer Therapy, 1985, Reisfeld et al. (eds.), pgs.
243-56, Alan R. Liss, Inc.; Hellstrom et al., "Antibodies For Drug Delivery", in Controlled
Drug Delivery (2nd Ed.), 1987, Robinson et al. (eds.), pgs. 623-53, Marcel Dekker, Inc.;
Thorpe, "Antibody Carriers Of Cytotoxic Agents In Cancer Therapy: A Review", in
Monoclonal Antibodies '84: Biological And Clinical Applications, 1985, Pinchera et al.
(eds.), pgs. 475-506; "Analysis, Results, And Future Prospective Of The Therapeutic Use
Of Radiolabeled Antibody In Cancer Therapy", in Monoclonal Antibodies For Cancer
Detection And Therapy, 1985, Baldwin et al. (eds.), pgs. 303-16, Academic Press, and
Thorpe et al., "The Preparation And Cytotoxic Properties Of Antibody-Toxin Conjugates",
Immunol. Rev., 1982, 62:119-58.

Alternatively, an antibody can be conjugated to a second antibody to form an
antibody heteroconjugate as described by Segal in U.S. Patent No. 4,676,980.

Accordingly, in one aspect, the invention provides substantially purified antibodies or
fragments thereof, and human or non-human antibodies or fragments thereof, which
antibodies or fragments specifically bind to a polypeptide comprising an amino acid
sequence selected from the group consisting of: the amino acid sequence of any one of SEQ
ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29; or an amino acid sequence encoded by the
cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC® Accession
Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at least 15 amino
acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20,
23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the
percent identity is determined using the ALIGN program of the GCG software package with
a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an
amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the
nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15,
16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited as ATCC®
Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC® Accession
Number PTA-250, or a complement thereof, under conditions of hybridization of 6X SSC at
45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. In various embodiments, the
substantially purified antibodies of the invention, or fragments thereof, can be human,
non-human, chimeric and/or humanized antibodies.

In another aspect, the invention provides human or non-human antibodies or
fragments thereof, which antibodies or fragments specifically bind to a polypeptide
comprising an amino acid sequence selected from the group consisting of: the amino acid
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid
sequence encoded by the cDNA of a clone deposited as ATCC® Accession Number 207178,
ATCC® Accession Number PTA-249, or ATCC® Accession Number PTA-250; a fragment
of at least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs:2,
5, 8, 11, 14, 17, 20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to
the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29,
wherein the percent identity is determined using the ALIGN program of the GCG software
package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty
of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which
hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7,
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited
as ATCC® Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC®
Accession Number PTA-250, or a complement thereof, under conditions of hybridization of
6X SSC at 45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. Such non-human
antibodies can be goat, mouse, sheep, horse, chicken, rabbit, or rat antibodies.
Alternatively, the non-human antibodies of the invention can be chimeric and/or humanized
antibodies. In addition, the human or non-human antibodies of the invention can be
polyclonal antibodies or monoclonal antibodies.
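
As a rough illustration of the "at least 95% identical" criterion recited above, percent identity over an existing pairwise alignment could be computed as sketched below. The sketch does not reproduce the ALIGN program or its PAM120 scoring and gap penalties; it assumes the two sequences have already been aligned (with '-' marking gaps), and the function names and the choice of denominator (all alignment columns) are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes both strings come from one pairwise alignment (equal length,
    '-' marks a gap). The denominator is the number of alignment columns,
    which is one common convention and not necessarily the one ALIGN reports.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count columns where both sequences carry the same non-gap residue.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention a 20-column alignment with one mismatch sits exactly at the 95% threshold, while a gapped column always counts against identity.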

In still a further aspect, the invention provides monoclonal antibodies or fragments
thereof, which antibodies or fragments specifically bind to a polypeptide comprising an
amino acid sequence selected from the group consisting of: the amino acid sequence of any
one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid sequence
encoded by the cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC®
Accession Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at
least 15 amino acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8,
11, 14, 17, 20, 23, 26, or 29; an amino acid sequence which is at least 95% identical to the
amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29,
wherein the percent identity is determined using the ALIGN program of the GCG software
package with a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty
of 4; and an amino acid sequence which is encoded by a nucleic acid molecule which
hybridizes to the nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7,
9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited
as any of ATCC® Accession Number 207178, ATCC® Accession Number PTA-249, or
ATCC® Accession Number PTA-250, or a complement thereof, under conditions of
hybridization of 6X SSC at 45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. The
monoclonal antibodies can be human, humanized, chimeric and/or non-human antibodies.

The substantially purified antibodies or fragments thereof specifically bind to a
signal peptide, a secreted sequence, an extracellular domain, a transmembrane domain, or a
cytoplasmic domain of a polypeptide of the invention. In a
particularly preferred embodiment, the substantially purified antibodies or fragments
thereof, the human or non-human antibodies or fragments thereof, and/or the monoclonal
antibodies or fragments thereof, of the invention specifically bind to a secreted sequence or
an extracellular domain of the amino acid sequence of SEQ ID NOs:103, 107, 114, 118,
122, 129, or 134. Preferably, the secreted sequence or extracellular domain to which the
antibody, or fragment thereof, binds comprises from about amino acids 25-374 of SEQ ID
NO:5 (SEQ ID NO:103), from amino acids 1-73 of SEQ ID NO:8 (SEQ ID NO:107), from
amino acids 21-767 of SEQ ID NO:14 (SEQ ID NO:114), from amino acids 1-216 of SEQ
ID NO:17 (SEQ ID NO:118), from amino acids 1-500 of SEQ ID NO:20 (SEQ ID NO:122),
from amino acids 20-169 of SEQ ID NO:26 (SEQ ID NO:129), and from amino acids
22-244 of SEQ ID NO:29 (SEQ ID NO:134).
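
The residue ranges above are 1-based and inclusive (e.g., amino acids 25-374 span 350 residues). A minimal sketch of extracting such a range from a full-length sequence held as a string follows; the helper name and the toy sequence are hypothetical, and no actual SEQ ID NO sequence is reproduced here.

```python
def residue_range(sequence: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if first < 1 or last > len(sequence) or first > last:
        raise ValueError("range falls outside the sequence")
    # Python slices are 0-based and end-exclusive, hence first - 1.
    return sequence[first - 1:last]


# Toy 30-residue sequence standing in for a full-length polypeptide.
toy = "MKTLLVAGALLLSAQAEPSTKDCVWYRHNQ"
# Hypothetical mature or extracellular portion: residues 17 through 30.
domain = residue_range(toy, 17, 30)
```

The length of any such range is last - first + 1, which gives a quick check that a recited fragment meets a minimum length such as 15 residues.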

Any of the antibodies of the invention can be conjugated to a therapeutic moiety or 
to a detectable substance. Non-limiting examples of detectable substances that can be 
conjugated to the antibodies of the invention are an enzyme, a prosthetic group, a 
fluorescent material, a luminescent material, a bioluminescent material, and a radioactive 
material. 

The invention also provides a kit containing an antibody of the invention conjugated 
to a detectable substance, and instructions for use. Still another aspect of the invention is a 
pharmaceutical composition comprising an antibody of the invention and a 
pharmaceutically acceptable carrier. In preferred embodiments, the pharmaceutical 
composition contains an antibody of the invention, a therapeutic moiety, and a 
pharmaceutically acceptable carrier. 

Still another aspect of the invention is a method of making an antibody that
specifically recognizes INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272,
TANGO 295, TANGO 354, and TANGO 378, the method comprising immunizing a
mammal with a polypeptide. The polypeptide used as an immunogen comprises an amino
acid sequence selected from the group consisting of: the amino acid sequence of any one of
SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, or an amino acid sequence encoded by
the cDNA of a clone deposited as ATCC® Accession Number 207178, ATCC® Accession
Number PTA-249, or ATCC® Accession Number PTA-250; a fragment of at least 15 amino
acid residues of the amino acid sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20,
23, 26, or 29; an amino acid sequence which is at least 95% identical to the amino acid
sequence of any one of SEQ ID NOs:2, 5, 8, 11, 14, 17, 20, 23, 26, or 29, wherein the
percent identity is determined using the ALIGN program of the GCG software package with
a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4; and an
amino acid sequence which is encoded by a nucleic acid molecule which hybridizes to the
nucleic acid molecule consisting of any one of SEQ ID NOs:1, 3, 4, 6, 7, 9, 10, 12, 13, 15,
16, 18, 19, 21, 22, 24, 25, 27, 28, or 30, or the cDNA of a clone deposited as ATCC®
Accession Number 207178, ATCC® Accession Number PTA-249, or ATCC® Accession
Number PTA-250, or a complement thereof, under conditions of hybridization of 6X SSC at
45°C and washing in 0.2X SSC, 0.1% SDS at 65°C. After immunization, a sample is
collected from the mammal that contains an antibody that specifically recognizes the
polypeptide. Preferably, the polypeptide is recombinantly produced using a non-human host
cell. Optionally, the antibodies can be further purified from the sample using techniques well
known to those of skill in the art. The method can further comprise producing a
monoclonal antibody-producing cell from the cells of the mammal. Optionally, antibodies
are collected from the antibody-producing cell.



III. Recombinant Expression Vectors and Host Cells

Another aspect of the invention pertains to vectors, preferably expression vectors,
containing a nucleic acid encoding a polypeptide of the invention (or a portion thereof). As
used herein, the term "vector" refers to a nucleic acid molecule capable of transporting
another nucleic acid to which it has been linked. One type of vector is a "plasmid", which
refers to a circular double stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein additional DNA segments can be
ligated into the viral genome. Certain vectors are capable of autonomous replication in a
host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other vectors (e.g., non-episomal
mammalian vectors) are integrated into the genome of a host cell upon introduction into the
host cell, and thereby are replicated along with the host genome. Moreover, certain vectors,
expression vectors, are capable of directing the expression of genes to which they are
operably linked. In general, expression vectors of utility in recombinant DNA techniques
are often in the form of plasmids (vectors). However, the invention is intended to include
such other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the
invention in a form suitable for expression of the nucleic acid in a host cell. This means
that the recombinant expression vectors include one or more regulatory sequences, selected
on the basis of the host cells to be used for expression, which is operably linked to the
nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably
linked" is intended to mean that the nucleotide sequence of interest is linked to the
regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence
(e.g., in an in vitro transcription/translation system or in a host cell when the vector is
introduced into the host cell). The term "regulatory sequence" is intended to include
promoters, enhancers and other expression control elements (e.g., polyadenylation signals).
Such regulatory sequences are described, for example, in Goeddel, Gene Expression
Technology: Methods in Enzymology, 1990, Academic Press, San Diego, CA. Regulatory
sequences include those which direct constitutive expression of a nucleotide sequence in
many types of host cell and those which direct expression of the nucleotide sequence only
in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by
those skilled in the art that the design of the expression vector can depend on such factors as
the choice of the host cell to be transformed, the level of expression of protein desired, etc.
The expression vectors of the invention can be introduced into host cells to thereby produce
proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as
described herein.

The recombinant expression vectors of the invention can be designed for expression
of a polypeptide of the invention in prokaryotic (e.g., E. coli) or eukaryotic cells (e.g., insect
cells (using baculovirus expression vectors), yeast cells or mammalian cells). Suitable host
cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression
vector can be transcribed and translated in vitro, for example using T7 promoter regulatory
sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in E. coli with
vectors containing constitutive or inducible promoters directing the expression of either
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion
vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2)
to increase the solubility of the recombinant protein; and 3) to aid in the purification of the
recombinant protein by acting as a ligand in affinity purification. Often, in fusion
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion
moiety and the recombinant protein to enable separation of the recombinant protein from
the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their
cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical
fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988,
Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia,
Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or
protein A, respectively, to the target recombinant protein.

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc
(Amann et al., 1988, Gene 69:301-15) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology, 1990, Academic Press, San Diego, CA pgs. 60-89).
Target gene expression from the pTrc vector relies on host RNA polymerase transcription
from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d vector
relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral
RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or
HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional
control of the lacUV 5 promoter.

One strategy to maximize recombinant protein expression in E. coli is to express the
protein in a host bacteria with an impaired capacity to proteolytically cleave the
recombinant protein (Gottesman, Gene Expression Technology: Methods in Enzymology,
1990, Academic Press, San Diego, CA pgs. 119-128). Another strategy is to alter the
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli (Wada et
al., 1992, Nucleic Acids Res. 20:2111-8). Such alteration of nucleic acid sequences of the
invention can be carried out by standard DNA synthesis techniques.
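
The codon-substitution strategy can be sketched as a simple back-translation that assigns one host-preferred codon to each amino acid. The table below is an illustrative assumption (one frequently used E. coli codon per residue) and is not taken from Wada et al.; an actual design would draw on a measured codon usage table for the chosen host and would typically also consider factors this sketch ignores, such as restriction sites and mRNA structure.

```python
# Illustrative table: one frequently used E. coli codon per amino acid.
# These choices are an assumption for the sketch; a real design would take
# them from a measured codon usage table for the chosen host strain.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}


def back_translate(protein: str) -> str:
    """Encode a protein using the single preferred codon for each residue."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
```

Because every residue maps to exactly one codon, the encoded protein is unchanged while the nucleotide sequence is rewritten toward the host's preferred codons.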

In another embodiment, the expression vector is a yeast expression vector.
Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al.,
1987, EMBO J. 6:229-34), pMFa (Kurjan and Herskowitz, 1982, Cell 30:933-43), pJRY88
(Schultz et al., 1987, Gene 54:113-23), pYES2 (Invitrogen Corporation, San Diego, CA),
and pPicZ (Invitrogen Corp, San Diego, CA).

Alternatively, the expression vector is a baculovirus expression vector. Baculovirus
vectors available for expression of proteins in cultured insect cells (e.g., Sf 9 cells) include
the pAc series (Smith et al., 1983, Mol. Cell Biol. 3:2156-65) and the pVL series (Luckow
and Summers, 1989, Virology 170:31-9).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, 1987, Nature 329:840) and pMT2PC (Kaufman
et al., 1987, EMBO J. 6:187-95). When used in mammalian cells, the expression vector's
control functions are often provided by viral regulatory elements. For example, commonly
used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian
Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells
see chapters 16 and 17 of Sambrook et al., supra.

In another embodiment, the recombinant mammalian expression vector is capable of
directing expression of the nucleic acid preferentially in a particular cell type (e.g.,
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific
promoters include the albumin promoter (liver-specific; Pinkert et al., 1987, Genes Dev.
1:268-77), lymphoid-specific promoters (Calame and Eaton, 1988, Adv. Immunol.
43:235-75), in particular promoters of T cell receptors (Winoto and Baltimore, 1989, EMBO J.
8:729-33) and immunoglobulins (Banerji et al., 1983, Cell 33:729-40; Queen and
Baltimore, 1983, Cell 33:741-8), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989, Proc. Natl. Acad. Sci. USA 86:5473-7),
pancreas-specific promoters (Edlund et al., 1985, Science 230:912-6), and mammary gland-specific
promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application
Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for
example the murine hox promoters (Kessel and Gruss, 1990, Science 249:374-9) and the
α-fetoprotein promoter (Camper and Tilghman, 1989, Genes Dev. 3:537-46).

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation.
That is, the DNA molecule is operably linked to a regulatory sequence in a manner which
allows for expression (by transcription of the DNA molecule) of an RNA molecule which is
antisense to the mRNA encoding a polypeptide of the invention. Regulatory sequences
operably linked to a nucleic acid cloned in the antisense orientation can be chosen which
direct the continuous expression of the antisense RNA molecule in a variety of cell types,
for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which
direct constitutive, tissue specific or cell type specific expression of antisense RNA. The
antisense expression vector can be in the form of a recombinant plasmid, phagemid or
attenuated virus in which antisense nucleic acids are produced under the control of a high
efficiency regulatory region, the activity of which can be determined by the cell type into
which the vector is introduced. For a discussion of the regulation of gene expression using
antisense genes see Weintraub et al. (1985, Reviews - Trends in Genetics 1(1):22-5).
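
In sequence terms, an insert placed in the antisense orientation reads, relative to the promoter, as the reverse complement of the coding strand, so that its transcript is complementary to the mRNA. A minimal sketch of that operation follows; the function name is hypothetical and the example sequences in the usage note are toy inputs rather than any sequence of the invention.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse to restore 5'-to-3' reading order.
    return dna.translate(_COMPLEMENT)[::-1]
```

For a toy coding-strand fragment ATGAAA, the antisense-orientation insert would read TTTCAT; applying the function twice returns the original sequence.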

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such terms
refer not only to the particular subject cell but to the progeny or potential progeny of such a
cell. Because certain modifications may occur in succeeding generations due to either
mutation or environmental influences, such progeny may not, in fact, be identical to the
parent cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic (e.g., E. coli) or eukaryotic cell (e.g., insect cells,
yeast or mammalian cells).

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" are intended to refer to a variety of art-recognized techniques for introducing
foreign nucleic acid into a host cell, including calcium phosphate or calcium chloride
co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.
Suitable methods for transforming or transfecting host cells can be found in Sambrook, et
al. (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred selectable
markers include those which confer resistance to drugs, such as G418, hygromycin and
methotrexate. Cells stably transfected with the introduced nucleic acid can be identified by
drug selection (e.g., cells that have incorporated the selectable marker gene will survive,
while the other cells die).

In another embodiment, the expression characteristics of an endogenous (e.g.,
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354,
and TANGO 378) nucleic acid within a cell, cell line or microorganism may be modified by
inserting a DNA regulatory element heterologous to the endogenous gene of interest into
the genome of a cell, stable cell line or cloned microorganism such that the inserted
regulatory element is operatively linked with the endogenous gene (e.g., INTERCEPT 340,
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378)
and controls, modulates or activates the endogenous gene. For example, endogenous
INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354,
and TANGO 378 which are normally "transcriptionally silent", i.e., INTERCEPT 340,
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378
genes which are normally not expressed, or are expressed only at very low levels in a cell
line or microorganism, may be activated by inserting a regulatory element which is capable
of promoting the expression of a normally expressed gene product in that cell line or
microorganism. Alternatively, transcriptionally silent, endogenous INTERCEPT 340,
MANGO 003, MANGO 347, TANGO 272, TANGO 295, TANGO 354, and TANGO 378
genes may be activated by insertion of a promiscuous regulatory element that works across
cell types.

A heterologous regulatory element may be inserted into a stable cell line or cloned
microorganism, such that it is operatively linked with and activates expression of
endogenous INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295,
TANGO 354, and TANGO 378 genes, using techniques, such as targeted homologous
recombination, which are well known to those of skill in the art, and described, e.g., in
Chappel, U.S. Patent No. 5,272,071; PCT publication No. WO 91/06667, published May
16, 1991.

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture,
can be used to produce a polypeptide of the invention. Accordingly, the invention further
provides methods for producing a polypeptide of the invention using the host cells of the
invention. In one embodiment, the method comprises culturing the host cell of the invention
(into which a recombinant expression vector encoding a polypeptide of the invention has
been introduced) in a suitable medium such that the polypeptide is produced. In another
embodiment, the method further comprises isolating the polypeptide from the medium or
the host cell.

The host cells of the invention can also be used to produce nonhuman transgenic
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte
or an embryonic stem cell into which sequences encoding a polypeptide of the invention
have been introduced. Such host cells can then be used to create non-human transgenic
animals in which exogenous sequences encoding a polypeptide of the invention have been
introduced into their genome or homologous recombinant animals in which endogenous
sequences encoding a polypeptide of the invention have been altered. Such animals are
useful for studying the function and/or activity of the polypeptide and for identifying and/or
evaluating modulators of polypeptide activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse,
in which one or more of the cells of the animal includes a transgene. Other examples of
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens,
amphibians, etc. A transgene is exogenous DNA which is integrated into the genome of a
cell from which a transgenic animal develops and which remains in the genome of the
mature animal, thereby directing the expression of an encoded gene product in one or more
cell types or tissues of the transgenic animal. As used herein, an "homologous recombinant
animal" is a non-human animal, preferably a mammal, more preferably a mouse, in which
an endogenous gene has been altered by homologous recombination between the
endogenous gene and an exogenous DNA molecule introduced into a cell of the animal,
e.g., an embryonic cell of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing nucleic acid
encoding a polypeptide of the invention (or a homologue thereof) into the male pronuclei of
a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to
develop in a pseudopregnant female foster animal. Intronic sequences and polyadenylation
signals can also be included in the transgene to increase the efficiency of expression of the
transgene. A tissue-specific regulatory sequence(s) can be operably linked to the transgene
to direct expression of the polypeptide of the invention to particular cells. Methods for
generating transgenic animals via embryo manipulation and microinjection, particularly
animals such as mice, have become conventional in the art and are described, for example,
in U.S. Patent Nos. 4,736,866; 4,870,009; 4,873,191 and in Hogan (Manipulating the
Mouse Embryo, 1986, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
Similar methods are used for production of other transgenic animals. A transgenic founder
animal can be identified based upon the presence of the transgene in its genome and/or
expression of mRNA encoding the transgene in tissues or cells of the animals. A transgenic
founder animal can then be used to breed additional animals carrying the transgene.
Moreover, transgenic animals carrying the transgene can further be bred to other transgenic
animals carrying other transgenes.

To create an homologous recombinant animal, a vector is prepared which contains at
least a portion of a gene encoding a polypeptide of the invention into which a deletion,
addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the
gene. In a preferred embodiment, the vector is designed such that, upon homologous
recombination, the endogenous gene is functionally disrupted (i.e., no longer encodes a
functional protein; also referred to as a "knock out" vector). Alternatively, the vector can be
designed such that, upon homologous recombination, the endogenous gene is mutated or
otherwise altered but still encodes functional protein (e.g., the upstream regulatory region
can be altered to thereby alter the expression of the endogenous protein). In the
homologous recombination vector, the altered portion of the gene is flanked at its 5' and 3'
ends by additional nucleic acid of the gene to allow for homologous recombination to occur
between the exogenous gene carried by the vector and an endogenous gene in an embryonic
stem cell. The additional flanking nucleic acid sequences are of sufficient length for
successful homologous recombination with the endogenous gene. Typically, several
kilobases of flanking DNA (both at the 5' and 3' ends) are included in the vector (see, e.g.,
Thomas and Capecchi, 1987, Cell 51:503 for a description of homologous recombination
vectors). The vector is introduced into an embryonic stem cell line (e.g., by electroporation)
and cells in which the introduced gene has homologously recombined with the endogenous
gene are selected (see, e.g., Li et al., 1992, Cell 69:915). The selected cells are then injected
into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see, e.g.,
Bradley in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, 1987,
Robertson, ed., IRL, Oxford pgs. 113-52). A chimeric embryo can then be implanted into a
suitable pseudopregnant female foster animal and the embryo brought to term. Progeny
harboring the homologously recombined DNA in their germ cells can be used to breed
animals in which all cells of the animal contain the homologously recombined DNA by
germline transmission of the transgene. Methods for constructing homologous
recombination vectors and homologous recombinant animals are described further in
Bradley, 1991, Current Opinion in Bio/Technology 2:823-9 and in PCT Publication Nos.
WO 90/11354, WO 91/01140, WO 92/0968 and WO 93/04169.

In another embodiment, transgenic non-human animals can be produced which
contain selected systems which allow for regulated expression of the transgene. One
example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a
description of the cre/loxP recombinase system, see, e.g., Lakso et al., 1992, Proc. Natl.
Acad. Sci. USA 89:6232-6. Another example of a recombinase system is the FLP
recombinase system of Saccharomyces cerevisiae (O'Gorman et al., 1991, Science
251:1351-5). If a cre/loxP recombinase system is used to regulate expression of the
transgene, animals containing transgenes encoding both the Cre recombinase and a selected
protein are required. Such animals can be provided through the construction of "double"
transgenic animals, e.g., by mating two transgenic animals, one containing a transgene
encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut et al., 1997, Nature 385:810-3 and PCT 
Publication Nos. WO 97/07668 and WO 97/07669.



IV. Pharmaceutical Compositions 

The nucleic acid molecules, polypeptides, and antibodies (also referred to herein as 
"active compounds") of the invention can be incorporated into pharmaceutical compositions 
suitable for administration. Such compositions typically comprise the nucleic acid 
molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein 
the language "pharmaceutically acceptable carrier" is intended to include any and all 
solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and 
absorption delaying agents, and the like, compatible with pharmaceutical administration. 
The use of such media and agents for pharmaceutically active substances is well known in 
the art. Except insofar as any conventional media or agent is incompatible with the active 
compound, use thereof in the compositions is contemplated. Supplementary active 
compounds can also be incorporated into the compositions. 

The invention includes methods for preparing pharmaceutical compositions for
modulating the expression or activity of a polypeptide or nucleic acid of the invention.
Such methods comprise formulating a pharmaceutically acceptable carrier with an agent
which modulates expression or activity of a polypeptide or nucleic acid of the invention.
Such compositions can further include additional active agents. Thus, the invention further
includes methods for preparing a pharmaceutical composition by formulating a
pharmaceutically acceptable carrier with an agent which modulates expression or activity of
a polypeptide or nucleic acid of the invention and one or more additional active compounds.

The agent which modulates expression or activity may, for example, be a small
molecule. For example, such small molecules include peptides, peptidomimetics, amino
acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide
analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic
compounds) having a molecular weight less than about 10,000 grams per mole, organic or
inorganic compounds having a molecular weight less than about 5,000 grams per mole,
organic or inorganic compounds having a molecular weight less than about 1,000 grams per
mole, organic or inorganic compounds having a molecular weight less than about 500
grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such
compounds. It is understood that appropriate doses of small molecule agents depend upon
a number of factors within the ken of the ordinarily skilled physician, veterinarian, or
researcher. The dose(s) of the small molecule will vary, for example, depending upon the
identity, size, and condition of the subject or sample being treated, further depending upon
the route by which the composition is to be administered, if applicable, and the effect which
the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of
the invention. Exemplary doses include milligram or microgram amounts of the small
molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to
about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5
milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per
kilogram). It is furthermore understood that appropriate doses of a small molecule depend
upon the potency of the small molecule with respect to the expression or activity to be
modulated. Such appropriate doses may be determined using the assays described herein.
When one or more of these small molecules is to be administered to an animal (e.g., a
human) in order to modulate expression or activity of a polypeptide or nucleic acid of the
invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively
low dose at first, subsequently increasing the dose until an appropriate response is obtained.
In addition, it is understood that the specific dose level for any particular animal subject will
depend upon a variety of factors including the activity of the specific compound employed,
the age, body weight, general health, gender, and diet of the subject, the time of
administration, the route of administration, the rate of excretion, any drug combination, and
the degree of expression or activity to be modulated.
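
As a rough illustration of the weight-based arithmetic behind the exemplary ranges above,
the following Python sketch converts a per-kilogram dose range into absolute amounts for a
given subject or sample weight. The function name, the dictionary keys, and the 70 kg example
weight are illustrative assumptions and are not part of this specification; the sketch does
not replace the practitioner's determination of an appropriate dose described above.

```python
# Illustrative sketch only. The names and the example weight are hypothetical;
# only the three per-kilogram ranges are taken from the text above.

MICROGRAMS_PER_MILLIGRAM = 1000.0

# Exemplary ranges from the text, expressed in micrograms per kilogram.
EXEMPLARY_RANGES_UG_PER_KG = {
    "broad": (1.0, 500.0 * MICROGRAMS_PER_MILLIGRAM),        # 1 ug/kg to 500 mg/kg
    "intermediate": (100.0, 5.0 * MICROGRAMS_PER_MILLIGRAM),  # 100 ug/kg to 5 mg/kg
    "narrow": (1.0, 50.0),                                    # 1 ug/kg to 50 ug/kg
}


def absolute_dose_range_mg(range_ug_per_kg, weight_kg):
    """Return (low, high) absolute amounts in milligrams for a given weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = range_ug_per_kg
    return (
        low * weight_kg / MICROGRAMS_PER_MILLIGRAM,
        high * weight_kg / MICROGRAMS_PER_MILLIGRAM,
    )


if __name__ == "__main__":
    for name, per_kg_range in EXEMPLARY_RANGES_UG_PER_KG.items():
        low_mg, high_mg = absolute_dose_range_mg(per_kg_range, weight_kg=70.0)
        print(f"{name}: {low_mg:g} mg to {high_mg:g} mg")
```

The ranges are kept in a single unit (micrograms per kilogram) so that the milligram and
microgram figures quoted in the text can be compared directly.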

A pharmaceutical composition of the invention is formulated to be compatible with
its intended route of administration. Examples of routes of administration include
parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal
(topical), transmucosal, and rectal administration. Solutions or suspensions used for
parenteral, intradermal, or subcutaneous application can include the following components:
a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols,
glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl
alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating
agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or
phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose.
pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide.
The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose
vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous
preparation of sterile injectable solutions or dispersions. For intravenous administration,
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF;
Parsippany, NJ) or phosphate buffered saline (PBS). In all cases, the composition must be
sterile and should be fluid to the extent that easy syringability exists. It must be stable
under the conditions of manufacture and storage and must be preserved against the
contaminating action of microorganisms such as bacteria and fungi. The carrier can be a
solvent or dispersion medium containing, for example, water, ethanol, polyol (for example,
glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable
mixtures thereof. The proper fluidity can be maintained, for example, by the use of a
coating such as lecithin, by the maintenance of the required particle size in the case of
dispersion and by the use of surfactants. Prevention of the action of microorganisms can be
achieved by various antibacterial and antifungal agents, for example, parabens,
chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be
preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or
sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable
compositions can be brought about by including in the composition an agent which delays
absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound
(e.g., a polypeptide or antibody) in the required amount in an appropriate solvent with one
or a combination of ingredients enumerated above, as required, followed by filtered
sterilization. Generally, dispersions are prepared by incorporating the active compound into
a sterile vehicle which contains a basic dispersion medium and the required other
ingredients from those enumerated above. In the case of sterile powders for the preparation
of sterile injectable solutions, the preferred methods of preparation are vacuum drying and
freeze-drying which yields a powder of the active ingredient plus any additional desired
ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can 
be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral 
therapeutic administration, the active compound can be incorporated with excipients and
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared 
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is 
applied orally and swished and expectorated or swallowed. 

Pharmaceutically compatible binding agents and/or adjuvant materials can be
included as part of the composition. The tablets, pills, capsules, troches and the like can
contain any of the following ingredients, or compounds of a similar nature: a binder such as
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose;
a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as
magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening
agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl
salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, 
e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, and 
include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid 
derivatives. Transmucosal administration can be accomplished through the use of nasal 
sprays or suppositories. For transdermal administration, the active compounds are 
formulated into ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with
conventional suppository bases such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will 
protect the compound against rapid elimination from the body, such as a controlled release 
formulation, including implants and microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, 
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation 
of such formulations will be apparent to those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal 
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to 
viral antigens) can also be used as pharmaceutically acceptable carriers. These can be
prepared according to methods known to those skilled in the art, for example, as described 
in U.S. Patent No. 4,522,811. 



It is especially advantageous to formulate oral or parenteral compositions in dosage
unit form for ease of administration and uniformity of dosage. Dosage unit form as used
herein refers to physically discrete units suited as unitary dosages for the subject to be
treated; each unit containing a predetermined quantity of active compound calculated to
produce the desired therapeutic effect in association with the required pharmaceutical
carrier. The specifications for the dosage unit forms of the invention are dictated by and
directly dependent on the unique characteristics of the active compound and the particular
therapeutic effect to be achieved, and the limitations inherent in the art of compounding
such an active compound for the treatment of individuals.

For antibodies, the preferred dosage is 0.1 mg/kg to 100 mg/kg of body weight
(generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50
mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully
human antibodies have a longer half-life within the human body than other antibodies.
Accordingly, lower dosages and less frequent administration are often possible.
Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake
and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is
described by Cruikshank et al. (1997, J. Acquired Immune Deficiency Syndromes and
Human Retrovirology 14:193).

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e.,
an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about
0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and
even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to
6 mg/kg body weight.

The skilled artisan will appreciate that certain factors may influence the dosage
required to effectively treat a subject, including but not limited to the severity of the disease
or disorder, previous treatments, the general health and/or age of the subject, and other
diseases present. Moreover, treatment of a subject with a therapeutically effective amount
of a protein, polypeptide, or antibody can include a single treatment or, preferably, can
include a series of treatments. In a preferred example, a subject is treated with antibody,
protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one
time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more
preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks.
It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used
for treatment may increase or decrease over the course of a particular treatment. Changes in
dosage may result and become apparent from the results of diagnostic assays as described
herein.
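
The once-weekly series in the preferred example above reduces to simple arithmetic, which
the following Python sketch makes explicit. It assumes a constant dose across the series,
although the text notes that the effective dosage may change during treatment. The function
names and the example values are hypothetical and are offered only as an illustration.

```python
# Illustrative sketch only; function names and example values are hypothetical.
# Assumes one administration per week at a constant per-kilogram dose.

def series_total_mg(dose_mg_per_kg, weight_kg, weeks):
    """Total amount (mg) administered over a once-weekly series of treatments."""
    if dose_mg_per_kg <= 0 or weight_kg <= 0 or weeks < 1:
        raise ValueError("dose and weight must be positive and weeks at least 1")
    return dose_mg_per_kg * weight_kg * weeks


def within_preferred_example(dose_mg_per_kg, weeks):
    """True if the inputs fall in the 'about 0.1 to 20 mg/kg, about 1 to 10 weeks' example.

    The text says 'about', so the hard bounds used here are a simplification.
    """
    return 0.1 <= dose_mg_per_kg <= 20.0 and 1 <= weeks <= 10


if __name__ == "__main__":
    print(series_total_mg(10.0, 70.0, 4))       # amount over four weekly treatments
    print(within_preferred_example(10.0, 4))
```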

The nucleic acid molecules of the invention can be inserted into vectors and used as
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example,
intravenous injection, local administration (U.S. Patent 5,328,470) or by stereotactic
injection (see, e.g., Chen et al., 1994, Proc. Natl. Acad. Sci. USA 91:3054-7). The
pharmaceutical preparation of the gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery
vehicle is imbedded. Alternatively, where the complete gene delivery vector can be
produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical
preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 



V. Uses and Methods of the Invention 

The nucleic acid molecules, proteins, protein homologues, and antibodies described
herein can be used in one or more of the following methods: a) screening assays; b)
detection assays (e.g., chromosomal mapping, tissue typing, forensic biology); c) predictive
medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and
pharmacogenomics); and d) methods of treatment (e.g., therapeutic and prophylactic). For
example, the INTERCEPT 340, MANGO 003, MANGO 347, TANGO 272, TANGO 295,
TANGO 354, and TANGO 378 polypeptides of the invention can be used to modulate
cellular function, survival, morphology, proliferation, and/or differentiation of the cells in
which they are expressed. For example, the polypeptides of the invention can be used to
treat diseases such as neoplastic disorders (e.g., cancer, tumors) and hematopoietic disorders
(e.g., T cell disorders), among others. The isolated nucleic acid molecules of the invention
can be used to express proteins (e.g., via a recombinant expression vector in a host cell in
gene therapy applications), to detect mRNA (e.g., in a biological sample) or a genetic
lesion, and to modulate activity of a polypeptide of the invention. In addition, the
polypeptides of the invention can be used to screen drugs or compounds which modulate
activity or expression of a polypeptide of the invention as well as to treat disorders
characterized by insufficient or excessive production of a protein of the invention or
production of a form of a protein of the invention which has decreased or aberrant activity
compared to the wild type protein. In addition, the antibodies of the invention can be used
to detect and isolate a protein of the invention and modulate activity of a protein of the
invention.

This invention further pertains to novel agents identified by the above-described 
screening assays and uses thereof for treatments as described herein. 

A. Screening Assays

The invention provides a method (also referred to herein as a "screening assay") for
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides,
peptidomimetics, small molecules or other drugs) which bind to a polypeptide of the
invention or have a stimulatory or inhibitory effect on, for example, expression or activity
of a polypeptide of the invention.

In one embodiment, the invention provides assays for screening candidate or test
compounds which bind to or modulate the activity of the membrane-bound form of a
polypeptide of the invention or biologically active portion thereof. The test compounds of
the present invention can be obtained using any of the numerous approaches in
combinatorial library methods known in the art, including: biological libraries; spatially
addressable parallel solid phase or solution phase libraries; synthetic library methods
requiring deconvolution; the "one-bead one-compound" library method; and synthetic
library methods using affinity chromatography selection. The biological library approach is
limited to peptide libraries, while the other four approaches are applicable to peptide,
non-peptide oligomer or small molecule libraries of compounds (Lam, 1997, Anticancer Drug
Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art,
for example in: DeWitt et al., 1993, Proc. Natl. Acad. Sci. USA 90:6909; Erb et al., 1994,
Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al., 1994, J. Med. Chem. 37:2678;
Cho et al., 1993, Science 261:1303; Carrell et al., 1994, Angew. Chem. Int. Ed. Engl.
33:2059; Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al., 1994,
J. Med. Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992,
Bio/Techniques 13:412-21), or on beads (Lam, 1991, Nature 354:82-4), chips (Fodor, 1993,
Nature 364:555-6), bacteria (U.S. Patent No. 5,223,409), spores (U.S. Patent Nos.
5,571,698; 5,403,484; and 5,223,409), plasmids (Cull et al., 1992, Proc. Natl. Acad. Sci.
USA 89:1865-9) or phage (Scott and Smith, 1990, Science 249:386-90; Devlin, 1990,
Science 249:404-6; Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87:6378-82; and Felici,
1991, J. Mol. Biol. 222:301-10).

In one embodiment, an assay is a cell-based assay in which a cell which expresses a
membrane-bound form of a polypeptide of the invention, or a biologically active portion
thereof, on the cell surface is contacted with a test compound and the ability of the test
compound to bind to the polypeptide is determined. The cell, for example, can be a yeast cell
or a cell of mammalian origin. Determining the ability of the test compound to bind to the
polypeptide can be accomplished, for example, by coupling the test compound with a
radioisotope or enzymatic label such that binding of the test compound to the polypeptide or
biologically active portion thereof can be determined by detecting the labeled compound in
a complex. For example, test compounds can be labeled with ³⁵S, ¹⁴C, or ³H, either
directly or indirectly, and the radioisotope detected by direct counting of radioemission or
by scintillation counting. Alternatively, test compounds can be enzymatically labeled with,
for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic
label detected by determination of conversion of an appropriate substrate to product. In a
preferred embodiment, the assay comprises contacting a cell which expresses a
membrane-bound form of a polypeptide of the invention, or a biologically active portion
thereof, on the cell surface with a known compound which binds the polypeptide to form an
assay mixture, contacting the assay mixture with a test compound, and determining the
ability of the test compound to interact with the polypeptide, wherein determining the
ability of the test compound to interact with the polypeptide comprises determining the
ability of the test compound to preferentially bind to the polypeptide or a biologically
active portion thereof as compared to the known compound.
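
One common way to quantify such a comparison is to measure how much of a labeled known
compound is displaced when the test compound is added. The Python sketch below shows that
calculation from scintillation counts. It is a minimal illustration under stated assumptions:
the known compound (rather than the test compound) carries the radiolabel, a nonspecific
binding control is available, and the function name and example counts are hypothetical. The
specification itself does not prescribe this particular readout.

```python
# Illustrative sketch only. Assumes the KNOWN compound is radiolabeled and that
# counts per minute (cpm) are available for three conditions:
#   total_cpm       - labeled known compound alone
#   nonspecific_cpm - background binding control
#   with_test_cpm   - labeled known compound plus the test compound
# The function name and example numbers are hypothetical.

def percent_displacement(total_cpm, nonspecific_cpm, with_test_cpm):
    """Percent of specific binding of the known compound displaced by the test compound."""
    specific = total_cpm - nonspecific_cpm
    if specific <= 0:
        raise ValueError("total_cpm must exceed nonspecific_cpm")
    remaining = with_test_cpm - nonspecific_cpm
    return 100.0 * (1.0 - remaining / specific)


if __name__ == "__main__":
    # Example: 9,000 cpm of specific binding, of which 4,500 cpm remain.
    print(percent_displacement(10000.0, 1000.0, 5500.0))
```

A higher value suggests that the test compound competes more effectively with the known
compound for the polypeptide under the conditions tested.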

In another embodiment, the assay involves assessment of an activity characteristic of 
the polypeptide, wherein binding of the test compound with the polypeptide or a 
biologically active portion thereof alters (e.g., increases or decreases) the activity of the
polypeptide. 

In another embodiment, an assay is a cell-based assay comprising contacting a cell
expressing a membrane-bound form of a polypeptide of the invention, or a biologically
active portion thereof, on the cell surface with a test compound and determining the ability
of the test compound to modulate (e.g., stimulate or inhibit) the activity of the polypeptide
or biologically active portion thereof. Determining the ability of the test compound to
modulate the activity of the polypeptide or a biologically active portion thereof can be
accomplished, for example, by determining the ability of the polypeptide to bind to
or interact with a target molecule or to transport molecules across the cytoplasmic
membrane.

Determining the ability of a polypeptide of the invention to bind to or interact with a
target molecule can be accomplished by one of the methods described above for
determining direct binding. As used herein, a "target molecule" is a molecule with which a
selected polypeptide (e.g., a polypeptide of the invention) binds or interacts in nature,
for example, a molecule on the surface of a cell which expresses the selected protein, a
molecule on the surface of a second cell, a molecule in the extracellular milieu, a molecule
associated with the internal surface of a cell membrane or a cytoplasmic molecule. A target
molecule can be a polypeptide of the invention or some other polypeptide or protein. For
example, a target molecule can be a component of a signal transduction pathway which
facilitates transduction of an extracellular signal (e.g., a signal generated by binding of a
compound to a polypeptide of the invention) through the cell membrane and into the cell or
a second intercellular protein which has catalytic activity or a protein which facilitates the
association of downstream signaling molecules with a polypeptide of the invention.
Determining the ability of a polypeptide of the invention to bind to or interact with a target
molecule can be accomplished by determining the activity of the target molecule. For
example, the activity of the target molecule can be determined by detecting induction of a
cellular second messenger of the target (e.g., intracellular Ca²⁺, diacylglycerol, IP3, etc.),
detecting catalytic/enzymatic activity of the target on an appropriate substrate, detecting the
induction of a reporter gene (e.g., a regulatory element that is responsive to a polypeptide of
the invention operably linked to a nucleic acid encoding a detectable marker, e.g.,
luciferase), or detecting a cellular response, for example, cellular differentiation, or cell
proliferation.

In yet another embodiment, an assay of the present invention is a cell-free assay
comprising contacting a polypeptide of the invention or biologically active portion thereof
with a test compound and determining the ability of the test compound to bind to the
polypeptide or biologically active portion thereof. Binding of the test compound to the
polypeptide can be determined either directly or indirectly as described above. In a
preferred embodiment, the assay includes contacting the polypeptide of the invention or
biologically active portion thereof with a known compound which binds the polypeptide to
form an assay mixture, contacting the assay mixture with a test compound, and determining
the ability of the test compound to interact with the polypeptide, wherein determining the
ability of the test compound to interact with the polypeptide comprises determining the
ability of the test compound to preferentially bind to the polypeptide or biologically active
portion thereof as compared to the known compound.

In another embodiment, an assay is a cell-free assay comprising contacting a
polypeptide of the invention or biologically active portion thereof with a test compound and
determining the ability of the test compound to modulate (e.g., stimulate or inhibit) the
activity of the polypeptide or biologically active portion thereof. Determining the ability of
the test compound to modulate the activity of the polypeptide can be accomplished, for
example, by determining the ability of the polypeptide to bind to a target molecule by one
of the methods described above for determining direct binding. In an alternative
embodiment, determining the ability of the test compound to modulate the activity of the
polypeptide can be accomplished by determining the ability of the polypeptide of the
invention to further modulate the target molecule. For example, the catalytic/enzymatic
activity of the target molecule on an appropriate substrate can be determined as previously
described.

In yet another embodiment, the cell-free assay comprises contacting a polypeptide of
the invention or biologically active portion thereof with a known compound which binds the
polypeptide to form an assay mixture, contacting the assay mixture with a test compound,
and determining the ability of the test compound to interact with the polypeptide, wherein
determining the ability of the test compound to interact with the polypeptide comprises
determining the ability of the polypeptide to preferentially bind to or modulate the activity
of a target molecule.

The cell-free assays of the present invention are amenable to use of either a soluble
form or the membrane-bound form of a polypeptide of the invention. In the case of cell-free
assays comprising the membrane-bound form of the polypeptide, it may be desirable to
utilize a solubilizing agent such that the membrane-bound form of the polypeptide is
maintained in solution. Examples of such solubilizing agents include non-ionic detergents
such as n-octylglucoside, n-dodecylglucoside, n-octylmaltoside,
octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton X-100, Triton X-114, Thesit,
Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylamminio]-1-propane
sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylamminio]-2-hydroxy-1-propane
sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate.

In more than one embodiment of the above assay methods of the present invention,
it may be desirable to immobilize either the polypeptide of the invention or its target
molecule to facilitate separation of complexed from uncomplexed forms of one or both of
the proteins, as well as to accommodate automation of the assay. Binding of a test
compound to the polypeptide, or interaction of the polypeptide with a target molecule in the
presence and absence of a candidate compound, can be accomplished in any vessel suitable
for containing the reactants. Examples of such vessels include microtiter plates, test tubes,
and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which
adds a domain that allows one or both of the proteins to be bound to a matrix. For example,
glutathione-S-transferase fusion proteins can be
adsorbed onto glutathione sepharose beads (Sigma Chemical; St. Louis, MO) or glutathione
derivatized microtiter plates, which are then combined with the test compound or the test
compound and either the non-adsorbed target protein or a polypeptide of the invention, and
the mixture incubated under conditions conducive to complex formation (e.g., at
physiological conditions for salt and pH). Following incubation, the beads or microtiter
plate wells are washed to remove any unbound components and complex formation is
measured either directly or indirectly, for example, as described above. Alternatively, the
complexes can be dissociated from the matrix, and the level of binding or activity of the
polypeptide of the invention can be determined using standard techniques.
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Oilier techniques for immobilizing proteins on matrices can also be used in the 
screening assays of the invention. For example, either the polypeptide of the invention or 
its target molecule can be immobilized utilizing conjugation of biotin and streptavidin. 
Biotinylated polypeptide of the invention or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation
kit, Pierce Chemicals; Rockford, IL), and immobilized in the wells of streptavidin-coated
96-well plates (Pierce Chemical). Alternatively, antibodies reactive with the polypeptide of the
invention or target molecules but which do not interfere with binding of the polypeptide of 
the invention to its target molecule can be derivatized to the wells of the plate, and unbound 
target or polypeptide of the invention trapped in the wells by antibody conjugation.
Methods for detecting such complexes, in addition to those described above for the GST- 
immobilized complexes, include immunodetection of complexes using antibodies reactive 
with the polypeptide of the invention or target molecule, as well as enzyme-linked assays 
which rely on detecting an enzymatic activity associated with the polypeptide of the 
invention or target molecule. 

In another embodiment, modulators of expression of a polypeptide of the invention
are identified in a method in which a cell is contacted with a candidate compound and the
expression of the selected mRNA or protein (i.e., the mRNA or protein corresponding to a
polypeptide or nucleic acid of the invention) in the cell is determined. The level of
expression of the selected mRNA or protein in the presence of the candidate compound is
compared to the level of expression of the selected mRNA or protein in the absence of the
candidate compound. The candidate compound can then be identified as a modulator of
expression of the polypeptide of the invention based on this comparison. For example,
when expression of the selected mRNA or protein is greater (statistically significantly 
greater) in the presence of the candidate compound than in its absence, the candidate 
compound is identified as a stimulator of the selected mRNA or protein expression. 
Alternatively, when expression of the selected mRNA or protein is less (statistically 
significantly less) in the presence of the candidate compound than in its absence, the 
candidate compound is identified as an inhibitor of the selected mRNA or protein 
expression. The level of the selected mRNA or protein expression in the cells can be 
determined by methods described herein. 
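
By way of illustration only, the comparison described above can be sketched in Python. The sketch assumes that replicate expression measurements (e.g., normalized mRNA levels) are available for cells in the presence and in the absence of the candidate compound. It uses an exact two-sided permutation test as one possible measure of statistical significance. The function name, the 0.05 threshold, and the choice of test are assumptions of this sketch and are not specified herein.

```python
from itertools import combinations
from statistics import mean


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression measurements.

    treated / untreated: expression levels measured in the presence and
    absence of the candidate compound (hypothetical input format).
    Returns (label, p_value).
    """
    observed = mean(treated) - mean(untreated)
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    n_treated = len(treated)

    total = 0
    extreme = 0
    # Enumerate every way of relabelling the pooled measurements; this exact
    # enumeration is only practical for a small number of replicates.
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = mean(group_a) - mean(group_b)
        total += 1
        # Small tolerance guards against floating-point round-off.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    p_value = extreme / total
    if p_value >= alpha:
        return "no significant effect", p_value
    return ("stimulator" if observed > 0 else "inhibitor"), p_value
```

With four replicates per condition there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; fewer replicates cannot reach the 0.05 threshold assumed here.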

In yet another aspect of the invention, a polypeptide of the invention can be used as
a "bait protein" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al., 1993, Cell 72:223-32; Madura et al., 1993, J. Biol. Chem.
268:12046-54; Bartel et al., 1993, Bio/Techniques 14:920-4; Iwabuchi et al., 1993,
Oncogene 8:1693-6; and PCT Publication No. WO 94/10300), to identify other proteins
which bind to or interact with the polypeptide of the invention and modulate activity of the
polypeptide of the invention. Such binding proteins are also likely to be involved in the
propagation of signals by the polypeptide of the invention as, for example, upstream or
downstream elements of a signaling pathway involving the polypeptide of the invention.

This invention further pertains to novel agents identified by the above-described 
screening assays and uses thereof for treatments as described herein. 

B. Detection Assays

Portions or fragments of the cDNA sequences identified herein (and the 
corresponding complete gene sequences) can be used in numerous ways as polynucleotide 
reagents. For example, these sequences can be used to: (i) map their respective genes on a 
chromosome and, thus, locate gene regions associated with genetic disease; (ii) identify an 
individual from a minute biological sample (tissue typing); and (iii) aid in forensic
identification of a biological sample. These applications are described in the subsections
below. 



1. Chromosome Mapping

Once the sequence (or a portion of the sequence) of a gene has been isolated, this 
sequence can be used to map the location of the gene on a chromosome. Accordingly, 
nucleic acid molecules described herein or fragments thereof, can be used to map the 
location of the corresponding genes on a chromosome. The mapping of the sequences to 
chromosomes is an important first step in correlating these sequences with genes associated 
with disease. 

Briefly, genes can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp in length) from the sequence of a gene of the invention. Computer
analysis of the sequence of a gene of the invention can be used to rapidly select primers that
do not span more than one exon in the genomic DNA, since primers that span more than one
exon would complicate the amplification process. These primers can then be used for PCR
screening of somatic cell hybrids
containing individual human chromosomes. Only those hybrids containing the human gene
corresponding to the gene sequences will yield an amplified fragment. For a review of this
technique, see D'Eustachio et al. (1983, Science 220:919-24).
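
The primer selection step lends itself to a short illustration. The following Python sketch assumes that the exon boundaries are already known as half-open intervals on the cDNA coordinates (in practice they would have to be obtained by comparing the cDNA with genomic sequence); the function name and input format are hypothetical. It lists every window of 15-25 bases that lies wholly within one exon, so that no candidate primer spans an exon junction. It does not consider melting temperature, specificity, or other primer design criteria.

```python
def primers_within_single_exon(cdna, exon_bounds, min_len=15, max_len=25):
    """Yield (start, end, sequence) for candidate primers inside one exon.

    cdna:        cDNA sequence as a string.
    exon_bounds: list of (start, end) half-open intervals on the cDNA,
                 one per exon (hypothetical input; not given in the text).
    """
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole window inside the exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                yield start, end, cdna[start:end]


if __name__ == "__main__":
    # Toy 60-base cDNA with an assumed exon junction at position 20.
    toy_cdna = "ACGT" * 15
    toy_exons = [(0, 20), (20, 60)]
    candidates = list(primers_within_single_exon(toy_cdna, toy_exons))
    print(len(candidates), "candidate primer windows")
```

Candidates produced this way would still need to be screened experimentally, for example against the somatic cell hybrid panel described above.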

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular
sequence to a particular chromosome. Three or more sequences can be assigned per day
using a single thermal cycler. Using the nucleic acid sequences of the invention to design
oligonucleotide primers, sublocalization can be achieved with panels of fragments from
specific chromosomes. Other mapping strategies which can similarly be used to map a gene
to its chromosome include in situ hybridization (described in Fan et al., 1990, Proc. Natl.
Acad. Sci. USA 87:6223-7), pre-screening with labeled flow-sorted chromosomes (CITE),
and pre-selection by hybridization to chromosome specific cDNA libraries. Fluorescence in
situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can
further be used to provide a precise chromosomal location in one step. For a review of this
technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques, 1988,
Pergamon Press, NY.

Reagents for chromosome mapping can be used individually to mark a single
chromosome or a single site on that chromosome, or panels of reagents can be used for
marking multiple sites and/or multiple chromosomes. Reagents corresponding to 
noncoding regions of the genes actually are preferred for mapping purposes. Coding 
sequences are more likely to be conserved within gene families, thus increasing the chance 
of cross hybridizations during chromosomal mapping. 

Once a sequence has been mapped to a precise chromosomal location, the physical 
position of the sequence on the chromosome can be correlated with genetic map data. 
(Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, 
available on-line through Johns Hopkins University Welch Medical Library). The 
relationship between genes and disease, mapped to the same chromosomal region, can then 
be identified through linkage analysis (co-inheritance of physically adjacent genes), 
described in, e.g., Egeland et al., 1987, Nature 325:783-7.

Moreover, differences in the DNA sequences between individuals affected and 
unaffected with a disease associated with a gene of the invention can be determined. If a 
mutation is observed in some or all of the affected individuals but not in any unaffected
individuals, then the mutation is likely to be the causative agent of the particular disease. 
Comparison of affected and unaffected individuals generally involves first looking for 
structural alterations in the chromosomes such as deletions or translocations that are visible 
from chromosome spreads or detectable using PCR based on that DNA sequence.
Ultimately, complete sequencing of genes from several individuals can be performed to
confirm the presence of a mutation and to distinguish mutations from polymorphisms.
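
The selection criterion stated above (a sequence difference seen in some or all affected individuals but in no unaffected individual) can be expressed as a simple set operation. In the Python sketch below, each individual is represented by a set of variant labels; the labels and the function name are hypothetical. The result is only a list of candidates, since distinguishing a mutation from a polymorphism still requires sequencing of several individuals as described.

```python
def candidate_mutations(affected, unaffected):
    """Return variants seen in at least one affected individual and in no
    unaffected individual.

    affected / unaffected: lists of sets, one set of variant labels per
    individual (hypothetical representation).
    """
    seen_in_affected = set().union(*affected)
    seen_in_unaffected = set().union(*unaffected)
    return sorted(seen_in_affected - seen_in_unaffected)


if __name__ == "__main__":
    affected = [{"c.101A>G", "c.250C>T"}, {"c.101A>G"}, {"c.250C>T", "c.77G>A"}]
    unaffected = [{"c.250C>T"}, set()]
    # c.250C>T is excluded because an unaffected individual also carries it.
    print(candidate_mutations(affected, unaffected))
```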

Furthermore, the nucleic acid sequences disclosed herein can be used to perform 
searches against "mapping databases", e.g., a BLAST-type search, such that the chromosome
position of the gene is identified by sequence homology or identity with known sequence 
fragments which have been mapped to chromosomes. 

A polypeptide and fragments and sequences thereof and antibodies specific thereto 
can be used to map the location of the gene encoding the polypeptide on a chromosome. 
This mapping can be carried out by specifically detecting the presence of the polypeptide in 
members of a panel of somatic cell hybrids between cells of a first species of animal from 
which the protein originates and cells from a second species of animal and then determining 
which somatic cell hybrid(s) expresses the polypeptide and noting the chromosome(s) from
the first species of animal that it contains. For examples of this technique, see Pajunen et
al., 1988, Cytogenet. Cell Genet. 47:37-41 and Van Keuren et al., 1986, Hum. Genet.
74:34-40. Alternatively, the presence of the polypeptide in the somatic cell hybrids can be
determined by assaying an activity or property of the polypeptide, for example, enzymatic
activity, as described in Bordelon-Riser et al., 1979, Somatic Cell Genetics 5:597-613 and
Owerbach et al., 1978, Proc. Natl. Acad. Sci. USA 75:5640-5644.
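
Whether the readout is a PCR product, immunodetection of the polypeptide, or an enzymatic activity, assignment from a hybrid panel comes down to finding the chromosome whose presence across the hybrids matches the pattern of positive and negative results. A minimal Python sketch follows. The panel composition, hybrid names, and function name are invented for illustration, and the sketch requires perfect concordance, whereas a practical analysis might tolerate occasional discordant hybrids.

```python
def assign_chromosome(panel, positive):
    """Return chromosomes whose presence matches the assay result in every hybrid.

    panel:    dict mapping hybrid name -> set of first-species chromosomes
              retained by that hybrid (hypothetical data).
    positive: dict mapping hybrid name -> True if the hybrid scored positive
              (amplified fragment, detected polypeptide, or activity).
    """
    all_chromosomes = set().union(*panel.values())
    concordant = []
    for chromosome in sorted(all_chromosomes):
        # A chromosome is concordant when it is present in every positive
        # hybrid and absent from every negative hybrid.
        if all((chromosome in panel[h]) == positive[h] for h in panel):
            concordant.append(chromosome)
    return concordant


if __name__ == "__main__":
    panel = {
        "H1": {"1", "7"},
        "H2": {"7", "X"},
        "H3": {"1", "X"},
        "H4": {"12"},
    }
    positive = {"H1": True, "H2": True, "H3": False, "H4": False}
    print(assign_chromosome(panel, positive))
```

An empty result indicates that no single chromosome explains the observed pattern, and more than one result indicates that the panel cannot distinguish between the listed chromosomes.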

2. Tissue Typing 

The nucleic acid sequences of the present invention can also be used to identify 
individuals from minute biological samples. The United States military, for example, is
considering the use of restriction fragment length polymorphism (RFLP) for identification
of its personnel. In this technique, an individual's genomic DNA is digested with one or
more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
sequences of the present invention are useful as additional DNA markers for RFLP
(described in U.S. Patent 5,272,057).

Furthermore, the sequences of the present invention can be used to provide an
alternative technique which determines the actual base-by-base DNA sequence of selected
portions of an individual's genome. Thus, the nucleic acid sequences described herein can
be used to prepare two PCR primers from the 5' and 3' ends of the sequences. These
primers can then be used to amplify an individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from individuals, prepared in this manner, 
can provide unique individual identifications, as each individual will have a unique set of 
such DNA sequences due to allelic differences. The sequences of the present invention can 
be used to obtain such identification sequences from individuals and from tissue. The 
nucleic acid sequences of the invention uniquely represent portions of the human genome. 
Allelic variation occurs to some degree in the coding regions of these sequences, and to a 
greater degree in the noncoding regions. It is estimated that allelic variation between 
individual humans occurs with a frequency of about once per each 500 bases. Each of the 
sequences described herein can, to some degree, be used as a standard against which DNA
from an individual can be compared for identification purposes. Because greater numbers
of polymorphisms occur in the noncoding regions, fewer sequences are necessary to
differentiate individuals. The noncoding sequences of SEQ ID NOs:1, 4, 7, 10, 13, 16, 19,
22, 25, and 28 can comfortably provide positive individual identification with a panel of
perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases.
If predicted coding sequences, such as those in SEQ ID NOs:3, 6, 9, 12, 15, 18, 21, 24, 27,
and 30 are used, a more appropriate number of primers for positive individual identification
would be 500-2,000.
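
As a rough check on the panel sizes given above, the expected number of variable positions covered by a panel can be estimated from the stated average of about one allelic difference per 500 bases. The Python sketch below applies that single average rate to every amplicon, which is a simplifying assumption: as noted above, noncoding regions vary more and coding regions less. The function name is hypothetical.

```python
def expected_variant_sites(n_amplicons, amplicon_length=100, bases_per_variant=500):
    """Estimate how many variable positions a primer panel is expected to cover.

    Assumes one allelic difference per `bases_per_variant` bases on average,
    applied uniformly (an assumption of this sketch).
    """
    return n_amplicons * amplicon_length / bases_per_variant


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, "amplicons:", expected_variant_sites(n), "expected variable positions")
```

Under this assumption a panel of 10 amplicons of 100 bases covers about 2 expected variable positions, and a panel of 1,000 amplicons covers about 200.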

If a panel of reagents from the nucleic acid sequences described herein is used to 
generate a unique identification database for an individual, those same reagents can later be
used to identify tissue from that individual. Using the unique identification database,
positive identification of the individual, living or dead, can be made from extremely small 
tissue samples. 

3. Use of Partial Gene Sequences in Forensic Biology

DNA-based identification techniques can also be used in forensic biology. Forensic 
biology is a scientific field employing genetic typing of biological evidence found at a 
crime scene as a means for positively identifying, for example, a perpetrator of a crime. To 
make such an identification, PCR technology can be used to amplify DNA sequences taken
from very small biological samples such as tissues, hair or skin, or body fluids, e.g., 
blood, saliva, or semen found at a crime scene. The amplified sequence can then be 
compared to a standard, thereby allowing identification of the origin of the biological 
sample. 
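
The comparison of an amplified sequence to a standard can be illustrated as follows. This Python sketch assumes that each reference individual and the evidence sample are represented as a mapping from a locus name to the sequence determined at that locus; the locus names, sequences, and function name are hypothetical. An individual is reported only if every locus typed in both profiles agrees exactly, and no allowance is made for sequencing error or mixed samples.

```python
def identify_source(sample_profile, reference_db):
    """Return reference individuals whose profile matches the sample.

    sample_profile: dict mapping locus name -> sequence from the evidence.
    reference_db:   dict mapping individual name -> profile dict of the
                    same form (hypothetical data layout).
    """
    matches = []
    for name, profile in reference_db.items():
        shared_loci = set(sample_profile) & set(profile)
        # Require at least one shared locus and exact agreement at all of them.
        if shared_loci and all(
            sample_profile[locus] == profile[locus] for locus in shared_loci
        ):
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    reference_db = {
        "individual_A": {"locus1": "ACGTTA", "locus2": "GGCATC"},
        "individual_B": {"locus1": "ACGTCA", "locus2": "GGCATC"},
    }
    sample = {"locus1": "ACGTCA", "locus2": "GGCATC"}
    print(identify_source(sample, reference_db))
```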

The sequences of the present invention can be used to provide polynucleotide
reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can
enhance the reliability of DNA-based forensic identifications by, for example, providing
another "identification marker" (i.e., another DNA sequence that is unique to a particular
individual). As mentioned above, actual base sequence information can be used for
identification as an accurate alternative to patterns formed by restriction enzyme generated
fragments. Sequences targeted to noncoding regions are particularly appropriate for this use
as greater numbers of polymorphisms occur in the noncoding regions, making it easier to
differentiate individuals using this technique. Examples of polynucleotide reagents include
the nucleic acid sequences of the invention or portions thereof, e.g., fragments derived from 
noncoding regions having a length of at least 20 or 30 bases. 

The nucleic acid sequences described herein can further be used to provide
polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example,
an in situ hybridization technique, to identify a specific tissue, e.g., brain tissue. This can
be very useful in cases where a forensic pathologist is presented with a tissue of unknown
origin. Panels of such probes can be used to identify tissue by species and/or by organ type.

C. Predictive Medicine:

The present invention also pertains to the field of predictive medicine in which 
diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic